Description
Search before asking
- I had searched in the issues and found no similar issues.
Version
4.0.2
What's Wrong?
This bug appeared after upgrading Doris from 4.0.0 to 4.0.2.
We observed abnormal thread growth in Doris FE. The number of threads continuously increases until it reaches more than 130,000, which eventually exhausts system memory and leads to os::commit_memory failed errors.
Most of these threads are named sdk-ScheduledExecutor-* and are in WAITING (parking) state. They are created by ScheduledThreadPoolExecutor and remain idle, waiting on DelayedWorkQueue.take().
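The root cause inside Doris FE is not identified in this report. As a hedged illustration only, the sketch below shows one common way to end up with this symptom: a new scheduled executor is created per use and never shut down, so each one leaves a core thread parked in DelayedWorkQueue.take(). If the two numeric suffixes in sdk-ScheduledExecutor-3296-3 are a pool number and a thread number (an assumption), the dump would be consistent with thousands of separate pools rather than one pool growing. The class name, the demo- prefix, and the helper methods are hypothetical and do not come from Doris or any SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExecutorLeakSketch {

    /** Counts live threads in this JVM whose name starts with the given prefix. */
    static long countThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    /**
     * Creates n single-thread schedulers, runs one task on each, and counts the
     * threads that remain while none of the schedulers has been shut down.
     * The schedulers are shut down afterwards only so the demo cleans up.
     */
    static long leakAndCount(String prefix, int n) {
        List<ScheduledExecutorService> leaked = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) {
                // Hypothetical naming: <prefix><poolNumber>-<threadNumber>
                final String name = prefix + i + "-0";
                ScheduledExecutorService ses = Executors.newScheduledThreadPool(1, r -> {
                    Thread t = new Thread(r, name);
                    t.setDaemon(true);
                    return t;
                });
                leaked.add(ses);
                // Running one task starts the core thread; it then stays alive
                // and waits for more work because the pool is never shut down.
                ses.schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get();
            }
            return countThreads(prefix);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            leaked.forEach(ScheduledExecutorService::shutdownNow);
        }
    }

    public static void main(String[] args) {
        System.out.println("idle threads: " + leakAndCount("demo-ScheduledExecutor-", 100));
    }
}
```

In this sketch the thread count grows linearly with the number of executors created, which matches the shape of the growth described above. It does not prove that Doris FE follows the same pattern.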
What You Expected?
Thread pools should be reused and limited in size.
Idle ScheduledThreadPoolExecutor threads should not grow indefinitely.
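A minimal sketch of the expected behaviour, assuming the scheduled work can share one executor: a single process-wide ScheduledThreadPoolExecutor with a fixed core size, so the thread count stays bounded however many tasks are scheduled. SharedScheduler, the shared-scheduler- prefix, and the pool size of 4 are illustrative choices, not Doris code or configuration.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public final class SharedScheduler {
    static final String PREFIX = "shared-scheduler-"; // hypothetical thread name prefix
    static final int POOL_SIZE = 4;                   // illustrative upper bound

    private static final ScheduledThreadPoolExecutor INSTANCE = create();

    private SharedScheduler() { }

    private static ScheduledThreadPoolExecutor create() {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, PREFIX + seq.getAndIncrement());
            t.setDaemon(true);
            return t;
        };
        ScheduledThreadPoolExecutor ex = new ScheduledThreadPoolExecutor(POOL_SIZE, factory);
        // Drop cancelled tasks from the queue immediately instead of at their due time.
        ex.setRemoveOnCancelPolicy(true);
        return ex;
    }

    /** Every caller gets the same executor instead of building a new one. */
    public static ScheduledExecutorService get() {
        return INSTANCE;
    }

    /** Schedules n no-op tasks, waits for each, and returns the current pool size. */
    static int runTasks(int n) {
        try {
            for (int i = 0; i < n; i++) {
                get().schedule(() -> { }, 0, TimeUnit.MILLISECONDS).get();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return INSTANCE.getPoolSize();
    }
}
```

If each executor is instead owned by a client object, an alternative is to close that object, and with it the executor, when it is no longer needed. Which of the two applies here depends on where the sdk-ScheduledExecutor-* pools are actually created.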
How to Reproduce?
1. Start Doris FE with JDK 17.
2. Monitor thread count using jstack or jcmd.
3. Observe continuous growth of threads named sdk-ScheduledExecutor-*.
4. Eventually, memory usage exceeds physical RAM and Doris FE crashes.
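To track the growth over time, a saved thread dump can be summarized instead of read by eye. The following is a small sketch that counts thread header lines by name prefix in the text output of jstack. ThreadDumpCounter and its methods are hypothetical helpers, and distinctPools assumes the name format prefix, pool number, dash, thread number, which is inferred from the dump rather than confirmed.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadDumpCounter {
    // A thread header line in a jstack dump starts with the quoted thread name.
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\"");

    /** Counts threads in the dump whose name starts with the given prefix. */
    static long countByPrefix(List<String> lines, String prefix) {
        long count = 0;
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.find() && m.group(1).startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    /** Counts distinct pool numbers, assuming names look like <prefix><pool>-<thread>. */
    static int distinctPools(List<String> lines, String prefix) {
        Set<String> pools = new HashSet<>();
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.find() && m.group(1).startsWith(prefix)) {
                String rest = m.group(1).substring(prefix.length());
                int dash = rest.lastIndexOf('-');
                pools.add(dash > 0 ? rest.substring(0, dash) : rest);
            }
        }
        return pools.size();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a file containing saved jstack output
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        String prefix = "sdk-ScheduledExecutor-";
        System.out.println("threads: " + countByPrefix(lines, prefix));
        System.out.println("pools:   " + distinctPools(lines, prefix));
    }
}
```

Running this against dumps taken a few minutes apart would show whether the number of pools, and not only the number of threads, keeps increasing.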
Anything Else?
Example thread dump:
"sdk-ScheduledExecutor-3296-3" #13890 daemon prio=5 os_prio=0 cpu=0.12ms elapsed=278.37s tid=0x00007fa6055380b0 nid=0x3e51 waiting on condition [0x00007fa2959d2000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@17.0.2/Native Method)
    - parking to wait for <0x0000000213a14f60> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(java.base@17.0.2/LockSupport.java:341)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(java.base@17.0.2/AbstractQueuedSynchronizer.java:506)
    at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@17.0.2/ForkJoinPool.java:3463)
    at java.util.concurrent.ForkJoinPool.managedBlock(java.base@17.0.2/ForkJoinPool.java:3434)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@17.0.2/AbstractQueuedSynchronizer.java:1623)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@17.0.2/ScheduledThreadPoolExecutor.java:1177)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(java.base@17.0.2/ScheduledThreadPoolExecutor.java:899)
    at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@17.0.2/ThreadPoolExecutor.java:1062)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.2/ThreadPoolExecutor.java:1122)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.2/ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(java.base@17.0.2/Thread.java:833)
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct