HDDS-14106. Add -XX:NewRatio=3 to default GC options for CMS #9967
adoroszlai merged 2 commits into apache:master
if [[ "$java_major_version" -lt 15 ]]; then
OZONE_OPTS="${OZONE_OPTS} -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled"
ozone_error "No '-XX:...' jvm parameters are set. Adding safer GC settings '-XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled' to the OZONE_OPTS"
OZONE_OPTS="${OZONE_OPTS} -XX:+UseConcMarkSweepGC -XX:NewRatio=3 -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled"
What would be the equivalent for G1GC?
Happily, there isn't one. The root cause of the problem is an ergonomic detail unique to CMS, from which G1GC does not suffer. This isn't even a bug; it is intended behavior, just from 2004, when there was nothing like modern scale challenges or the individual server resources to meet them.
If we switch to the question of "can G1GC performance be improved with this property?", the answer is no, as I understand it. G1GC depends on adaptive sizing of the young generation to meet the pause goals that are set. Any property that fixes the young generation size, whether NewRatio, Xmn, or NewSize=MaxNewSize, would stop that mechanism from working. This wouldn't break G1GC, but above some threshold we would likely start consistently exceeding its pause target.
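To make the point about G1GC concrete, here is a minimal sketch of a check for options that fix the young generation size. The function names `pins_young_gen` and `warn_if_g1_pinned` are hypothetical and not part of Ozone; the flags matched are the ones named in the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the Ozone scripts.

# Returns 0 if the option string fixes the young generation size:
# -Xmn, -XX:NewRatio, or -XX:NewSize equal to -XX:MaxNewSize.
# Sizes are compared as strings, so "2g" vs "2048m" is not detected.
pins_young_gen() {
  local opt new_size="" max_new_size=""
  for opt in $1; do   # intentional word splitting on the option string
    case "$opt" in
      -Xmn*|-XX:NewRatio=*) return 0 ;;
      -XX:NewSize=*) new_size="${opt#*=}" ;;
      -XX:MaxNewSize=*) max_new_size="${opt#*=}" ;;
    esac
  done
  [[ -n "$new_size" && "$new_size" == "$max_new_size" ]]
}

# Prints a warning only when G1 is selected together with a fixed young gen.
warn_if_g1_pinned() {
  local opts="$1"
  if [[ "$opts" == *"-XX:+UseG1GC"* ]] && pins_young_gen "$opts"; then
    echo "WARN: young generation size is fixed; G1 cannot resize it to meet its pause target"
  fi
}

warn_if_g1_pinned "-XX:+UseG1GC -XX:NewRatio=3"
```

With CMS options the same call prints nothing, which matches the intent of this patch: NewRatio is only added on the CMS path.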
Thanks @rnblough for the patch, @yandrey321 for the review.
What changes were proposed in this pull request?
I propose that NewRatio be specified by default in the Java options of all Ozone roles whenever -XX:+UseConcMarkSweepGC is set, to solve the long-standing problem of ConcurrentMarkSweep GC always ending up with a tiny Young Generation heap size. The consequence of the tiny Young Generation is ParNew thrashing and premature object promotion, which pollutes the Old Gen and eventually drives unnecessary full GCs. That part of the problem is straightforwardly diagnosable with GC logs and heap dumps, and as cluster sizes grew it has been fairly common in Hadoop deployments generally to address it with -XX:NewSize and -XX:MaxNewSize or with -Xmn. The insight here is that there is a consistent underlying driver in the JDK ergonomics, and that it can be trivially compensated for.
This primarily impacts larger deployments, particularly where lists of millions of objects such as keys or containerIDs become routine, even through internal reporting mechanisms.
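One way to see the effect on a given host is to ask the JVM which sizes its ergonomics selected. The sketch below assumes a JDK that still ships CMS (before JDK 14) and assumes the usual `-XX:+PrintFlagsFinal` column layout (type, name, `=` or `:=`, value). The helper name `young_gen_percent` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads -XX:+PrintFlagsFinal output on stdin and prints
# MaxNewSize as a percentage of MaxHeapSize.
# Assumes the flag name is the 2nd field and the value is the 4th field.
young_gen_percent() {
  awk '$2 == "MaxNewSize"  { young = $4 }
       $2 == "MaxHeapSize" { heap = $4 }
       END {
         if (heap > 0) printf "%.1f\n", 100 * young / heap
         else exit 1
       }'
}

# Usage (requires a JDK where CMS is available; the heap size is only an example):
#   java -XX:+UseConcMarkSweepGC -Xmx32g -XX:+PrintFlagsFinal -version 2>/dev/null | young_gen_percent
#   java -XX:+UseConcMarkSweepGC -XX:NewRatio=3 -Xmx32g -XX:+PrintFlagsFinal -version 2>/dev/null | young_gen_percent
```

Comparing the two invocations shows how much of the heap the young generation gets with and without the explicit NewRatio.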
This behavior was introduced deliberately in the JDK ergonomics. The earliest complaints about the behavior I encountered are from JDK6: https://bugs.openjdk.org/browse/JDK-6872335
But it looks like it was actually introduced before that, based on this doc describing GC tuning changes for J2SE 5.0: https://docs.oracle.com/javase/1.5.0/docs/guide/vm/gc-ergonomics.html
The choice of -XX:NewRatio=3 instead of the default value of 2 comes down to two points. First, Ozone does not require a young generation that is 1/3 of the total heap, which is what NewRatio=2 gives; among other things, most Ozone deployments have worked fine even with the artificially tiny value. Second, NewRatio adjusts automatically in tandem with heap size changes, whereas something like -Xmn would either need to be recalculated every time or be left static and require manual adjustment later.
Impacts to running clusters: I have observed one occasion where configuring a larger Young Generation heap size did result in ParNew collections taking substantially longer, on an SCM running with -Xmx200g. This was noticeable in the logs and detectable in some client interactions, but there were no further impacts. In every prod cluster I have seen where this change has been implemented, from ~100g on down, I have observed no negative impacts at all.
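The arithmetic behind that choice can be sketched as follows. NewRatio is the ratio of old to young generation, so the young generation is roughly heap / (NewRatio + 1). The helper `young_gen_mb` is illustrative only, the heap sizes are arbitrary examples, and the JVM's own alignment means real values will differ slightly.

```shell
#!/usr/bin/env bash
# Illustrative helper, not part of Ozone: approximate young generation size
# implied by NewRatio, where NewRatio = old generation : young generation.
young_gen_mb() {
  local heap_mb="$1" new_ratio="$2"
  echo $(( heap_mb / (new_ratio + 1) ))   # integer division, MB granularity
}

# Example heap sizes only; the ratio scales with -Xmx without recalculation.
for heap_mb in 8192 32768 102400; do
  echo "Xmx=${heap_mb}m NewRatio=2 -> $(young_gen_mb "$heap_mb" 2)m, NewRatio=3 -> $(young_gen_mb "$heap_mb" 3)m"
done
```

Because the result is derived from the heap size, changing -Xmx changes the young generation with it, which is the advantage over a fixed -Xmn.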
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-14106
How was this patch tested?
Manual testing, successful deployment in production clusters, and build-branch on my fork. Two tests failed, but they are not germane to the change and appear to be due to a config issue in the integration (container) setup:
org.apache.hadoop.ozone.container.diskbalancer.TestDefaultContainerChoosingPolicy
org.apache.hadoop.ozone.container.diskbalancer.TestDefaultVolumeChoosingPolicy