Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: Coordinator crashes with OOM in 468 #24572

Open
guyco33 opened this issue Dec 24, 2024 · 3 comments
Open

Regression: Coordinator crashes with OOM in 468 #24572

guyco33 opened this issue Dec 24, 2024 · 3 comments
Labels
bug Something isn't working RELEASE-BLOCKER

Comments

@guyco33
Copy link
Member

guyco33 commented Dec 24, 2024

Coordinator in 468 consumes much more memory than 467 that leads to OS OOM.
What could be the reason for such memory consumption change ?

Listed below the output from free memory stats and gc logs during execution from both coordinator versions.

467
===
** after 15 minutes 

[2024-12-24T11:10:18.671 0000][7691][info][gc,start       ] GC(60) Pause Young (Normal) (G1 Evacuation Pause)
[2024-12-24T11:10:18.672 0000][7691][info][gc,task        ] GC(60) Using 4 workers of 4 for evacuation
[2024-12-24T11:10:18.834 0000][7691][info][gc,phases      ] GC(60)   Pre Evacuate Collection Set: 0.87ms
[2024-12-24T11:10:18.834 0000][7691][info][gc,phases      ] GC(60)   Merge Heap Roots: 0.44ms
[2024-12-24T11:10:18.834 0000][7691][info][gc,phases      ] GC(60)   Evacuate Collection Set: 159.86ms
[2024-12-24T11:10:18.834 0000][7691][info][gc,phases      ] GC(60)   Post Evacuate Collection Set: 1.28ms
[2024-12-24T11:10:18.834 0000][7691][info][gc,phases      ] GC(60)   Other: 0.12ms
[2024-12-24T11:10:18.834 0000][7691][info][gc,heap        ] GC(60) Eden regions: 100->0(89)
[2024-12-24T11:10:18.834 0000][7691][info][gc,heap        ] GC(60) Survivor regions: 8->9(14)
[2024-12-24T11:10:18.834 0000][7691][info][gc,heap        ] GC(60) Old regions: 34->34
[2024-12-24T11:10:18.834 0000][7691][info][gc,heap        ] GC(60) Humongous regions: 0->0
[2024-12-24T11:10:18.834 0000][7691][info][gc,metaspace   ] GC(60) Metaspace: 291741K(295488K)->291741K(295488K) NonClass: 254345K(256448K)->254345K(256448K) Class: 37395K(39040K)->37395K(39040K)
[2024-12-24T11:10:18.834 0000][7691][info][gc             ] GC(60) Pause Young (Normal) (G1 Evacuation Pause) 4449M->1279M(16384M) 162.817ms
[2024-12-24T11:10:18.834 0000][7691][info][gc,cpu         ] GC(60) User=0.43s Sys=0.04s Real=0.17s
               total        used        free      shared  buff/cache   available
Mem:              30           7          19           0           4          23
Swap:              0           0           0
Tue Dec 24 11:10:26 UTC 2024


** after 1:15 hour

[2024-12-24T12:09:59.680 0000][7686][info][gc,marking     ] GC(173) Concurrent Mark From Roots 2367.862ms
[2024-12-24T12:09:59.680 0000][7686][info][gc,marking     ] GC(173) Concurrent Preclean
[2024-12-24T12:09:59.681 0000][7686][info][gc,marking     ] GC(173) Concurrent Preclean 1.161ms
[2024-12-24T12:09:59.681 0000][7691][info][gc,start       ] GC(173) Pause Remark
[2024-12-24T12:09:59.724 0000][7691][info][gc             ] GC(173) Pause Remark 5988M->5691M(16384M) 42.204ms
[2024-12-24T12:09:59.724 0000][7691][info][gc,cpu         ] GC(173) User=0.15s Sys=0.00s Real=0.04s
[2024-12-24T12:09:59.726 0000][7686][info][gc,marking     ] GC(173) Concurrent Mark 2413.972ms
[2024-12-24T12:09:59.726 0000][7686][info][gc,marking     ] GC(173) Concurrent Rebuild Remembered Sets and Scrub Regions
[2024-12-24T12:10:00.210 0000][7686][info][gc,marking     ] GC(173) Concurrent Rebuild Remembered Sets and Scrub Regions 483.491ms
[2024-12-24T12:10:00.210 0000][7691][info][gc,start       ] GC(173) Pause Cleanup
[2024-12-24T12:10:00.210 0000][7691][info][gc             ] GC(173) Pause Cleanup 5725M->5725M(16384M) 0.054ms
[2024-12-24T12:10:00.210 0000][7691][info][gc,cpu         ] GC(173) User=0.00s Sys=0.00s Real=0.00s
[2024-12-24T12:10:00.210 0000][7686][info][gc,marking     ] GC(173) Concurrent Clear Claimed Marks
[2024-12-24T12:10:00.210 0000][7686][info][gc,marking     ] GC(173) Concurrent Clear Claimed Marks 0.427ms
[2024-12-24T12:10:00.210 0000][7686][info][gc,marking     ] GC(173) Concurrent Cleanup for Next Mark
[2024-12-24T12:10:00.228 0000][7686][info][gc,marking     ] GC(173) Concurrent Cleanup for Next Mark 17.547ms
[2024-12-24T12:10:00.228 0000][7686][info][gc             ] GC(173) Concurrent Mark Cycle 3285.446ms
               total        used        free      shared  buff/cache   available
Mem:              30          16          10           0           4          14
Swap:              0           0           0
Tue Dec 24 12:10:41 UTC 2024

===
468
===
** after 15 minutes 

[2024-12-24T11:10:04.019 0000][7679][info][gc,start       ] GC(62) Pause Young (Normal) (G1 Evacuation Pause)
[2024-12-24T11:10:04.019 0000][7679][info][gc,task        ] GC(62) Using 4 workers of 4 for evacuation
[2024-12-24T11:10:04.161 0000][7679][info][gc,phases      ] GC(62)   Pre Evacuate Collection Set: 0.32ms
[2024-12-24T11:10:04.161 0000][7679][info][gc,phases      ] GC(62)   Merge Heap Roots: 0.22ms
[2024-12-24T11:10:04.161 0000][7679][info][gc,phases      ] GC(62)   Evacuate Collection Set: 139.19ms
[2024-12-24T11:10:04.161 0000][7679][info][gc,phases      ] GC(62)   Post Evacuate Collection Set: 2.22ms
[2024-12-24T11:10:04.161 0000][7679][info][gc,phases      ] GC(62)   Other: 0.08ms
[2024-12-24T11:10:04.161 0000][7679][info][gc,heap        ] GC(62) Eden regions: 118->0(104)
[2024-12-24T11:10:04.161 0000][7679][info][gc,heap        ] GC(62) Survivor regions: 8->10(16)
[2024-12-24T11:10:04.161 0000][7679][info][gc,heap        ] GC(62) Old regions: 33->33
[2024-12-24T11:10:04.161 0000][7679][info][gc,heap        ] GC(62) Humongous regions: 0->0
[2024-12-24T11:10:04.161 0000][7679][info][gc,metaspace   ] GC(62) Metaspace: 292209K(296000K)->292209K(296000K) NonClass: 254676K(256768K)->254676K(256768K) Class: 37533K(39232K)->37533K(39232K)
[2024-12-24T11:10:04.161 0000][7679][info][gc             ] GC(62) Pause Young (Normal) (G1 Evacuation Pause) 5009M->1277M(16384M) 142.330ms
[2024-12-24T11:10:04.162 0000][7679][info][gc,cpu         ] GC(62) User=0.44s Sys=0.04s Real=0.14s
               total        used        free      shared  buff/cache   available
Mem:              30          16           9           0           4          13
Swap:              0           0           0
Tue Dec 24 11:10:27 UTC 2024


** after 1:15 hour

[2024-12-24T12:10:12.713 0000][7679][info][gc,start       ] GC(179) Pause Young (Normal) (G1 Evacuation Pause)
[2024-12-24T12:10:12.714 0000][7679][info][gc,task        ] GC(179) Using 4 workers of 4 for evacuation
[2024-12-24T12:10:12.795 0000][7679][info][gc,phases      ] GC(179)   Pre Evacuate Collection Set: 0.34ms
[2024-12-24T12:10:12.795 0000][7679][info][gc,phases      ] GC(179)   Merge Heap Roots: 0.24ms
[2024-12-24T12:10:12.795 0000][7679][info][gc,phases      ] GC(179)   Evacuate Collection Set: 77.50ms
[2024-12-24T12:10:12.795 0000][7679][info][gc,phases      ] GC(179)   Post Evacuate Collection Set: 2.34ms
[2024-12-24T12:10:12.795 0000][7679][info][gc,phases      ] GC(179)   Other: 0.09ms
[2024-12-24T12:10:12.795 0000][7679][info][gc,heap        ] GC(179) Eden regions: 303->0(302)
[2024-12-24T12:10:12.795 0000][7679][info][gc,heap        ] GC(179) Survivor regions: 4->5(39)
[2024-12-24T12:10:12.795 0000][7679][info][gc,heap        ] GC(179) Old regions: 113->113
[2024-12-24T12:10:12.795 0000][7679][info][gc,heap        ] GC(179) Humongous regions: 1->1
[2024-12-24T12:10:12.795 0000][7679][info][gc,metaspace   ] GC(179) Metaspace: 296387K(300352K)->296387K(300352K) NonClass: 258630K(260800K)->258630K(260800K) Class: 37756K(39552K)->37756K(39552K)
[2024-12-24T12:10:12.795 0000][7679][info][gc             ] GC(179) Pause Young (Normal) (G1 Evacuation Pause) 13361M->3699M(16384M) 82.112ms
[2024-12-24T12:10:12.795 0000][7679][info][gc,cpu         ] GC(179) User=0.26s Sys=0.01s Real=0.08s
               total        used        free      shared  buff/cache   available
Mem:              30          29           1           0           0           0
Swap:              0           0           0
Tue Dec 24 12:10:25 UTC 2024


*** syslog OOM output 
grep -i oom /var/log/syslog

2024-12-24T12:14:35.481240 00:00 ip-172-30-20-32 kernel: http-worker-691 invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
2024-12-24T12:14:35.496433 00:00 ip-172-30-20-32 kernel:  oom_kill_process 0x116/0x270
2024-12-24T12:14:35.496434 00:00 ip-172-30-20-32 kernel:  ? oom_evaluate_task 0x143/0x1e0
2024-12-24T12:14:35.496435 00:00 ip-172-30-20-32 kernel:  __alloc_pages_may_oom 0x10c/0x1d0
2024-12-24T12:14:35.511859 00:00 ip-172-30-20-32 kernel: [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
2024-12-24T12:14:35.511912 00:00 ip-172-30-20-32 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/cloud-final.service,task=trino-server,pid=7634,uid=0
2024-12-24T12:14:35.511912 00:00 ip-172-30-20-32 kernel: Out of memory: Killed process 7634 (trino-server) total-vm:36015048kB, anon-rss:29973656kB, file-rss:2948kB, shmem-rss:0kB, UID:0 pgtables:61856kB oom_score_adj:0
2024-12-24T12:14:35.792898 00:00 ip-172-30-20-32 systemd[1]: cloud-final.service: A process of this unit has been killed by the OOM killer.
2024-12-24T12:14:35.803968 00:00 ip-172-30-20-32 systemd[1]: cloud-final.service: Failed with result 'oom-kill'.
2024-12-24T12:14:35.806272 00:00 ip-172-30-20-32 systemd[1]: system.slice: A process of this unit has been killed by the OOM killer.

Trino coordinator is running on EC2 machine type r6a.xlarge with OpenJDK Runtime Environment Temurin-23.0.1 11 (build 23.0.1 11)

jvm.conf is

-server
-Xmx16G
-Xms16G
-XX: UseG1GC
-XX:G1HeapRegionSize=32M
-XX: ExplicitGCInvokesConcurrent
-XX: ExitOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-Djava.io.tmpdir=/data/tmpdir
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Dfile.encoding=UTF-8
-Dcom.sun.jndi.ldap.object.disableEndpointIdentification=true
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
# Reduce starvation of threads by GClocker, recommend to set about the number of cpu cores (JDK-8192647)
-XX: UnlockDiagnosticVMOptions
-XX:G1NumCollectionsKeepPinned=10000000
-XX:GCLockerRetryAllocationCount=32
-Xlog:gc*:/data/trino/var/log/gc.log:time,level,tags,tid:filecount=10,filesize=100M
@guyco33
Copy link
Member Author

guyco33 commented Dec 24, 2024

Executing same load on r6a.2xlarge with double memory size and it seems that OOM will happen soon.

Memory consumption after 1.5 hours:

===
467
===

$ date && free -g && (top -n 20| grep trino) && date && free -g

Tue Dec 24 15:00:31 UTC 2024
               total        used        free      shared  buff/cache   available
Mem:              61          24          32           0           4          36
Swap:              0           0           0

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME  COMMAND
   7766 root      20   0   44.6g  23.0g  30676 S  18.2  37.4  35:48.76 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  20.3  37.4  35:49.37 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  18.6  37.4  35:49.93 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  16.3  37.4  35:50.42 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  15.0  37.4  35:50.87 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  16.6  37.4  35:51.37 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  12.3  37.4  35:51.74 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  11.3  37.4  35:52.08 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  13.7  37.4  35:52.49 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  12.3  37.4  35:52.86 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S  11.0  37.4  35:53.19 trino-s 
   7766 root      20   0   44.6g  23.0g  30676 S   7.6  37.4  35:53.42 trino-s 
   7766 root      20   0   44.5g  23.0g  30676 S   7.3  37.4  35:53.64 trino-s 
   7766 root      20   0   44.5g  23.0g  30676 S   6.0  37.4  35:53.82 trino-s 
   7766 root      20   0   44.5g  23.0g  30676 S   6.6  37.4  35:54.02 trino-s 
   7766 root      20   0   44.5g  23.0g  30676 S   6.6  37.4  35:54.22 trino-s 
   7766 root      20   0   44.4g  23.0g  30676 S   8.0  37.4  35:54.46 trino-s 
   7766 root      20   0   44.4g  23.0g  30676 S   6.3  37.4  35:54.65 trino-s 
   7766 root      20   0   44.4g  23.0g  30676 S   7.0  37.4  35:54.86 trino-s 
   7766 root      20   0   44.6g  23.1g  30676 S 278.7  37.6  36:03.25 trino-s 
Tue Dec 24 15:01:29 UTC 2024
===
468
===

$ date && free -g && (top -n 20| grep trino) && date && free -g

Tue Dec 24 15:00:30 UTC 2024
               total        used        free      shared  buff/cache   available
Mem:              61          43          13           0           4          17
Swap:              0           0           0

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME  COMMAND
   7784 root      20   0   61.7g  42.1g  30820 S  10.0  68.6  39:12.02 trino-s 
   7784 root      20   0   61.7g  42.2g  30820 S   8.3  68.6  39:12.27 trino-s 
   7784 root      20   0   61.8g  42.2g  30820 S   7.0  68.7  39:12.48 trino-s 
   7784 root      20   0   61.8g  42.2g  30820 S  10.3  68.7  39:12.79 trino-s 
   7784 root      20   0   61.9g  42.2g  30820 S   7.3  68.7  39:13.01 trino-s 
   7784 root      20   0   61.9g  42.3g  30820 S   5.6  68.8  39:13.18 trino-s 
   7784 root      20   0   61.9g  42.3g  30820 S   5.7  68.8  39:13.35 trino-s 
   7784 root      20   0   61.8g  42.3g  30820 S   5.0  68.9  39:13.50 trino-s 
   7784 root      20   0   61.8g  42.3g  30820 S   3.0  68.9  39:13.59 trino-s 
   7784 root      20   0   61.8g  42.3g  30820 S   2.0  68.9  39:13.65 trino-s 
   7784 root      20   0   61.8g  42.3g  30820 S   2.3  68.8  39:13.72 trino-s 
   7784 root      20   0   61.8g  42.3g  30820 S   1.7  68.8  39:13.77 trino-s 
   7784 root      20   0   61.8g  42.2g  30820 S   2.0  68.7  39:13.83 trino-s 
   7784 root      20   0   61.8g  42.2g  30820 S   2.0  68.7  39:13.89 trino-s 
   7784 root      20   0   62.0g  42.4g  30820 S 181.4  69.0  39:19.35 trino-s 
   7784 root      20   0   62.9g  43.0g  30820 S 220.7  70.0  39:25.97 trino-s 
   7784 root      20   0   63.4g  43.5g  30820 S  41.2  70.8  39:27.21 trino-s 
   7784 root      20   0   63.8g  44.0g  30820 S  43.9  71.6  39:28.53 trino-s 
   7784 root      20   0   64.6g  44.7g  30820 S 155.0  72.7  39:33.18 trino-s 
   7784 root      20   0   64.9g  45.1g  30820 S  34.6  73.3  39:34.22 trino-s 
Tue Dec 24 15:01:27 UTC 2024
               total        used        free      shared  buff/cache   available
Mem:              61          46          10           0           4          14
Swap:              0           0           0

@guyco33 guyco33 added bug Something isn't working RELEASE-BLOCKER labels Dec 24, 2024
@guyco33
Copy link
Member Author

guyco33 commented Dec 24, 2024

OS OOM happens again

$ grep oom /var/log/syslog

2024-12-24T17:14:34.706917 00:00 ip-172-30-21-101 kernel: query-execution invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
2024-12-24T17:14:34.744709 00:00 ip-172-30-21-101 kernel:  oom_kill_process 0x116/0x270
2024-12-24T17:14:34.744711 00:00 ip-172-30-21-101 kernel:  ? oom_evaluate_task 0x143/0x1e0
2024-12-24T17:14:34.744712 00:00 ip-172-30-21-101 kernel:  __alloc_pages_may_oom 0x10c/0x1d0
2024-12-24T17:14:34.792485 00:00 ip-172-30-21-101 kernel: [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
2024-12-24T17:14:34.792510 00:00 ip-172-30-21-101 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/cloud-final.service,task=trino-server,pid=7784,uid=0
2024-12-24T17:14:34.792511 00:00 ip-172-30-21-101 kernel: Out of memory: Killed process 7784 (trino-server) total-vm:74788372kB, anon-rss:63005676kB, file-rss:2412kB, shmem-rss:0kB, UID:0 pgtables:128700kB oom_score_adj:0
2024-12-24T17:14:37.446266 00:00 ip-172-30-21-101 kernel: oom_reaper: reaped process 7784 (trino-server), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
2024-12-24T17:14:37.759101 00:00 ip-172-30-21-101 systemd[1]: cloud-final.service: Failed with result 'oom-kill'.

@wendigo
Copy link
Contributor

wendigo commented Dec 24, 2024

@guyco33 can you do a heapdump when gc pressure goes up?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working RELEASE-BLOCKER
Development

No branches or pull requests

2 participants