YARN Resource Management
Also available as:
PDF
loading table of contents...

Managing Cluster Capacity with Queues

The Capacity Scheduler is designed to allow organizations to share compute clusters using the very familiar notion of FIFO (first-in, first-out) queues. YARN does not assign entire nodes to queues. Queues own a fraction of the capacity of the cluster, and this specified q ueue capacity can be fulfilled from any number of nodes in a dynamic fashion.

Scheduling is the process of matching resource requirements -- of multiple applications from various users, and submitted to different queues at multiple levels in the queue hierarchy -- with the free capacity available on the nodes in the cluster. Because total cluster capacity can vary, capacity configuration values are expressed as percents.

The capacity property can be used by administrators to allocate a percentage of cluster capacity to a queue. The following properties would divide the cluster resources between the Engineering, Support, and Marketing organizations in a 6:1:3 ratio (60%, 10%, and 30%).

Property: yarn.scheduler.capacity.root.engineering.capacity

Value: 60

Property: yarn.scheduler.capacity.root.support.capacity

Value: 10

Property: yarn.scheduler.capacity.root.marketing.capacity

Value: 30

Now suppose that the Engineering group decides to split its capacity between the Development and QA sub-teams in a 1:4 ratio. That would be implemented by setting the following properties:

Property:yarn.scheduler.capacity.root.engineering.development.capacity

Value: 20

Property: yarn.scheduler.capacity.root.engineering.qa.capacity

Value: 80

[Note]Note

The sum of capacities at any level in the hierarchy must equal 100%. Also, the capacity of an individual queue at any level in the hierarchy must be 1% or more (you cannot set a capacity to a value of 0).

The following image illustrates this cluster capacity configuration: