YARN Resource Management

Configuring CPU Scheduling

Use the following steps to configure CPU scheduling.

  1. Enable CPU Scheduling in capacity-scheduler.xml

    CPU scheduling is not enabled by default. To enable CPU scheduling, set the following property in the /etc/hadoop/conf/capacity-scheduler.xml file on the ResourceManager and NodeManager hosts, replacing the DefaultResourceCalculator portion of the <value> string with DominantResourceCalculator:

    Property: yarn.scheduler.capacity.resource-calculator

    Value: org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <!-- <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value> -->
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>
  2. Set Vcores in yarn-site.xml

    In YARN, vcores (virtual cores) are used to normalize CPU resources across the cluster. The yarn.nodemanager.resource.cpu-vcores value sets the number of vcores on a NodeManager host that can be allocated to containers.

    You should set the number of vcores to match the number of physical CPU cores on the NodeManager hosts. Set the following property in the /etc/hadoop/conf/yarn-site.xml file on the ResourceManager and NodeManager hosts:

    Property: yarn.nodemanager.resource.cpu-vcores

    Value: <number_of_physical_cores>

    Example:

    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>16</value>
    </property>

You should also enable CGroups along with CPU scheduling. CGroups are used as the isolation mechanism for CPU usage. With CGroups strict enforcement activated, each process receives only the CPU resources it requests. Without CGroups activated, the DRF (Dominant Resource Fairness) scheduler attempts to balance the load, but unpredictable behavior may occur.
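
As a rough illustration of what enabling CGroups with strict enforcement might involve, the following yarn-site.xml fragment is a minimal sketch only. The property names and class names are not taken from this page; they are assumed from the Apache Hadoop NodeManager CGroups settings and should be verified against the documentation for your Hadoop version. A working setup typically needs further settings that are not shown here, such as the CGroups hierarchy, mount options, and the container executor group.

```xml
<!-- Sketch only: property names assumed from Apache Hadoop NodeManager CGroups settings. -->

<!-- Assumption: CGroups requires the Linux container executor. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>

<!-- Assumption: this resources handler places containers under CGroups control. -->
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>

<!-- Assumption: "true" limits each container to the CPU it requested (strict enforcement). -->
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
  <value>true</value>
</property>
```

With strict resource usage left at false, containers would generally be allowed to use idle CPU beyond their request, which is more efficient but less predictable than the strict mode described above.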