HDFS Administration

Chapter 10. G1GC Garbage Collector Configuration (Technical Preview)

Oracle JDK 7 update 4 introduced a new garbage collector (GC), the Garbage First (G1) GC. This chapter provides recommended settings for G1GC parameters and compares them with the settings for the currently used Concurrent Mark Sweep (CMS) GC.

Note

This feature is a technical preview and is considered to be under development. Do not use this feature in your production systems. If you have questions regarding this feature, contact Support by logging a case on our Hortonworks Support Portal at https://support.hortonworks.com.

Recommended Settings for G1GC

Based on initial testing, there appears to be no significant improvement in the NameNode startup process when using G1GC rather than CMS. The following NameNode settings are recommended for G1GC in a large cluster:

  • Approximately 10% more Java heap space (-Xms and -Xmx) should be allocated to the NameNode than in a CMS setup. Recommendations for setting the CMS heap size are described in Configuring NameNode Heap Size.

  • For large clusters (more than 50 million files), MaxGCPauseMillis should be set to 4000 (milliseconds).

  • You should set ParallelGCThreads to 20 (default for a 32-core machine), as opposed to 8 for CMS.

  • Other G1GC parameters should be left set to their default values.
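
The recommendations above can be sketched as a small shell calculation. This is an illustration only: the helper name g1_heap_gb and the 100 GB starting heap are assumptions, not values from this guide, and the result should be checked against your own heap sizing.

```shell
# Hypothetical helper: scale an existing CMS heap size (in whole GB)
# up by 10%, rounding up to the next whole GB.
g1_heap_gb() {
  echo $(( ($1 * 110 + 99) / 100 ))
}

CMS_HEAP_GB=100                          # assumed current CMS heap size
G1_HEAP_GB=$(g1_heap_gb "$CMS_HEAP_GB")  # 10% more heap for G1GC

# Print the JVM options that correspond to the recommendations above.
echo "-Xms${G1_HEAP_GB}g -Xmx${G1_HEAP_GB}g -XX:+UseG1GC -XX:MaxGCPauseMillis=4000 -XX:ParallelGCThreads=20"
```

The printed string is only a starting point; the MaxGCPauseMillis value applies to the large-cluster case described above.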

We have observed that G1GC does not comply with the maximum heap size (-Xmx) setting. With -Xmx set to 110 GB, we observed the following VM statistics:

  • For CMS: Maximum heap (VmPeak) = 113 GB.

  • For G1GC: Maximum heap (VmPeak) = 147 GB.
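
To take a comparable measurement on your own cluster, one possible approach on Linux is to read the VmPeak field from the process status file. VmPeak is the peak virtual memory size of the whole process, so it includes memory beyond the Java heap itself. The helper name vmpeak_gb is hypothetical, and the commented usage assumes that pgrep is available and that a single NameNode process is running on the host.

```shell
# Hypothetical helper: print the VmPeak value from a /proc/<pid>/status
# file in whole GB. The kernel reports VmPeak in kB.
vmpeak_gb() {
  awk '/^VmPeak:/ { printf "%d\n", $2 / 1048576 }' "$1"
}

# Usage sketch (assumes Linux, pgrep, and a single running NameNode):
# NN_PID=$(pgrep -f org.apache.hadoop.hdfs.server.namenode.NameNode)
# vmpeak_gb "/proc/${NN_PID}/status"
```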

Configuration Settings for G1GC

To switch from CMS to G1GC, you must update the HADOOP_NAMENODE_OPTS settings in the hadoop-env.sh file. On the Ambari dashboard, select HDFS > Configs > Advanced > Advanced hadoop-env, then make the following changes to the HADOOP_NAMENODE_OPTS settings:

  • Replace -XX:+UseConcMarkSweepGC with -XX:+UseG1GC

  • Remove -XX:+UseCMSInitiatingOccupancyOnly and -XX:CMSInitiatingOccupancyFraction=####

  • Remove -XX:NewSize=#### and -XX:MaxNewSize=####

  • Add -XX:MaxGCPauseMillis=#### (optional)

  • Add -XX:InitiatingHeapOccupancyPercent=#### (optional)

  • Add -XX:ParallelGCThreads=####, if not present (optional). The default value of this parameter equals the number of logical processors, up to a value of 8. For more than 8 logical processors, the default value is approximately 5/8 of the number of logical processors.
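
The default rule for ParallelGCThreads described in the last item can be sketched as follows. The function name is hypothetical, and the calculation follows the rule exactly as stated here. The value a particular JVM actually chooses may differ slightly, so it is worth confirming with java -XX:+PrintFlagsFinal -version on the target host.

```shell
# Hypothetical helper: estimate the default ParallelGCThreads value for a
# given number of logical processors, using the rule stated above.
default_parallel_gc_threads() {
  if [ "$1" -le 8 ]; then
    echo "$1"                 # up to 8 processors: one thread per processor
  else
    echo $(( $1 * 5 / 8 ))    # above 8: roughly 5/8 of the processors
  fi
}

default_parallel_gc_threads 32   # prints 20
```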

Use the optional settings only when you need to override the JVM default value for that parameter.
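
As an illustration of the replace and remove steps above, the following sketch rewrites a CMS-style option string with sed. The function name cms_to_g1_opts and the sample option string are assumptions. Your actual HADOOP_NAMENODE_OPTS value will contain other options and different sizes, so review the result by hand before saving it in Ambari.

```shell
# Hypothetical helper: apply the CMS-to-G1GC replace and remove steps
# to an option string passed as the first argument.
cms_to_g1_opts() {
  printf '%s\n' "$1" | sed -E \
    -e 's/-XX:\+UseConcMarkSweepGC/-XX:+UseG1GC/' \
    -e 's/ ?-XX:\+UseCMSInitiatingOccupancyOnly//' \
    -e 's/ ?-XX:CMSInitiatingOccupancyFraction=[0-9]+//' \
    -e 's/ ?-XX:(Max)?NewSize=[0-9]+[kKmMgG]?//g'
}

# Assumed sample CMS settings; all values are illustrative only.
CMS_OPTS="-Xms100g -Xmx100g -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -XX:NewSize=12g -XX:MaxNewSize=12g"

# Convert, then append the optional G1GC settings.
# Heap sizes are left unchanged here and must be adjusted separately.
G1_OPTS="$(cms_to_g1_opts "$CMS_OPTS") -XX:MaxGCPauseMillis=4000 -XX:ParallelGCThreads=20"
echo "$G1_OPTS"
```

The sketch only edits a string. It does not modify hadoop-env.sh or restart the NameNode.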

The recommended CMS settings for -Xmx and -Xms are described in Configuring NameNode Heap Size. For G1GC, these values can be adjusted as described in the previous section.