YARN Resource Management

Using Flexible Scheduling Policies

The default ordering policy in Capacity Scheduler is FIFO (First-In, First-Out). FIFO generally works well for predictable, recurring batch jobs, but sometimes not as well for on-demand or exploratory workloads. For these types of jobs, Fair Sharing is often a better choice. Flexible scheduling policies enable you to assign FIFO or Fair ordering policies to different types of workloads on a per-queue basis.

FIFO vs. Fair Sharing

Batch Example

In this example, two queues have the same resources available. One uses the FIFO ordering policy, and the other uses the Fair Sharing policy. A user submits three jobs to each queue one right after another, waiting just long enough for each job to start. The first job uses 6x the resource limit in the queue, the second 4x, and the last 2x.

  • In the FIFO queue, the 6x job would start and run to completion, then the 4x job would start and run to completion, and then the 2x job. They would start and finish in the order 6x, 4x, 2x.

  • In the Fair queue, the 6x job would start, then the 4x job, and then the 2x job. All three would run concurrently, with each using 1/3 of the available application resources. They would typically finish in the following order: 2x, 4x, 6x.
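
To make the batch example concrete, here is a rough worked timeline. It rests on assumptions that are not stated in the example itself: the queue delivers one unit of work per unit of time, a job described as kx needs k units of work in total, all three jobs start at essentially the same moment, and the Fair queue splits capacity exactly evenly among the jobs still running. A real scheduler only approximates this.

```latex
\begin{align*}
\text{FIFO:}\quad & t_{6x} = 6 \\
& t_{4x} = 6 + 4 = 10 \\
& t_{2x} = 10 + 2 = 12 \\[4pt]
\text{Fair:}\quad & t_{2x} = \frac{2}{1/3} = 6
  && \text{(three jobs, each at rate } 1/3\text{)} \\
& t_{4x} = 6 + \frac{4 - 2}{1/2} = 10
  && \text{(two jobs, each at rate } 1/2\text{)} \\
& t_{6x} = 10 + \frac{6 - 2 - 2}{1} = 12
  && \text{(one job at full rate)}
\end{align*}
```

Under these assumptions both queues finish all of the work at t = 12, but the completion order reverses: the 2x job finishes at t = 6 in the Fair queue rather than t = 12 in the FIFO queue, while the 6x job finishes at t = 12 rather than t = 6.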

Ad Hoc Plus Batch Example

In this example, a job using 10x the queue resources is running. After the job is halfway complete, the same user starts a second job needing 1x the queue resources.

  • In the FIFO queue, the 10x job will run until it no longer uses all queue resources (map phase complete, for example), and then the 1x job will start.

  • In the Fair queue, the 1x job will start, run, and complete as soon as possible – picking up resources from the 10x job by attrition.

Configuring Queue Ordering Policies

Ordering policies are configured in capacity-scheduler.xml.

To specify ordering policies on a per-queue basis, set the following property to fifo or fair. The default setting is fifo.

<property>
  <name>yarn.scheduler.capacity.<queue-path>.ordering-policy</name>
  <value>fair</value>
</property>

You can use the following property to enable size-based weighting of resource allocation. When this property is set to true, queue resources are assigned to individual applications based on their size, rather than providing an equal share of queue resources to all applications regardless of size. The default setting is false.

<property>
  <name>yarn.scheduler.capacity.<queue-path>.ordering-policy.fair.enable-size-based-weight</name>
  <value>true</value>
</property>
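
As an illustration of how the two properties fit together, the following sketch configures two queues in capacity-scheduler.xml. The queue paths root.batch and root.adhoc are hypothetical names chosen for this example; substitute the queue paths defined in your own cluster.

```xml
<!-- Hypothetical queue root.batch: recurring batch jobs, FIFO ordering -->
<property>
  <name>yarn.scheduler.capacity.root.batch.ordering-policy</name>
  <value>fifo</value>
</property>

<!-- Hypothetical queue root.adhoc: exploratory jobs, Fair ordering -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.ordering-policy</name>
  <value>fair</value>
</property>

<!-- Optional: weight allocation in root.adhoc by application size -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.ordering-policy.fair.enable-size-based-weight</name>
  <value>true</value>
</property>
```

Because fifo is the default, the first property could be omitted; it is written out here only to make the contrast explicit. After editing the file, the queue configuration is typically reloaded with yarn rmadmin -refreshQueues.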

Best Practices for Ordering Policies

  • Ordering policies are configured on a per-queue basis, with the default ordering policy set to FIFO. Fairness is usually best for on-demand, interactive, or exploratory workloads, while FIFO can be more efficient for predictable, recurring batch processing. You should segregate these different types of workloads into queues configured with the appropriate ordering policy.

  • In queues supporting both large and small applications, large applications can potentially "starve" (not receive sufficient resources). To avoid this scenario, use different queues for large and small jobs, or use size-based weighting to reduce the natural tendency of the ordering logic to favor smaller applications.

  • Use the yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent property to limit the number of applications running concurrently in a queue, so that too many applications do not run at once. The property caps the share of resources that can be used to run ApplicationMasters, which in turn bounds the number of active applications. Limits on each queue are directly proportional to the queue's capacity and user limits. The value is specified as a float, for example: 0.5 = 50%. The default setting is 0.1 (10%). To apply a value to all queues, set yarn.scheduler.capacity.maximum-am-resource-percent; to override it for an individual queue, set yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent.
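
The global setting and a per-queue override described in the last item might look like the following sketch. The queue path root.adhoc and the value 0.2 are assumptions made for illustration, not recommended settings.

```xml
<!-- Applies to all queues: ApplicationMasters may use up to 10% of resources -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.1</value>
</property>

<!-- Override for the hypothetical root.adhoc queue only: up to 20% -->
<property>
  <name>yarn.scheduler.capacity.root.adhoc.maximum-am-resource-percent</name>
  <value>0.2</value>
</property>
```

A higher value for a queue of many small, short-lived applications lets more of them be active at once; a lower value keeps more of the queue available for task containers.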