Configure a queue for batch processing
You can configure the capacity scheduler queues to scale a Hive batch job for your environment. YARN uses the queues to allocate Hadoop cluster resources among users and groups.
In this task, you create queues and set up a capacity scheduler to separate short- and long-running queries into the queues:
- This queue is used for short-duration queries and is assigned 50 percent of cluster resources.
- This queue is used for longer-duration queries and is assigned 50 percent of cluster resources.
In Ambari, access the capacity scheduler:
, and in Filter enter
- On the command line of the node where YARN is installed, go to the YARN /conf file, and open the capacity-scheduler.xml file.
hive2 queues, and set the maximum capacity to 50 percent of the queue users with a hard limit. For example:
yarn.scheduler.capacity.root.queues=hive1,hive2
yarn.scheduler.capacity.root.hive1.capacity=50
yarn.scheduler.capacity.root.hive2.capacity=50
If the maximum-capacity is set to more than 50 percent, the queue can use more than its capacity when there are other idle resources in the cluster.
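If you edit capacity-scheduler.xml directly rather than through Ambari, the same three settings would be written in Hadoop's standard property XML form. The fragment below is a minimal sketch that assumes hive1 and hive2 are the only queues directly under root. The capacity scheduler expects the capacities of sibling queues to add up to 100, which 50 and 50 satisfy.

```xml
<configuration>
  <!-- Child queues defined directly under the root queue -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>hive1,hive2</value>
  </property>
  <!-- Guaranteed share of cluster resources for each queue, in percent -->
  <property>
    <name>yarn.scheduler.capacity.root.hive1.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.hive2.capacity</name>
    <value>50</value>
  </property>
</configuration>
```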
Configure usage limits for these queues and their users.
The default value of 1 for user-limit-factor means that a single user in the queue can occupy at most 1x the queue's configured capacity. Together with maximum-capacity, these settings prevent users in one queue from monopolizing resources across all queues in a cluster.
yarn.scheduler.capacity.root.hive1.maximum-capacity=50
yarn.scheduler.capacity.root.hive2.maximum-capacity=50
yarn.scheduler.capacity.root.hive1.user-limit-factor=1
yarn.scheduler.capacity.root.hive2.user-limit-factor=1
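As a sketch of how these limits might look in capacity-scheduler.xml, the fragment below shows the entries for hive1; the hive2 entries would follow the same pattern. It uses user-limit-factor, which is the capacity scheduler's property name for the per-user multiplier of queue capacity. Treat the values as the example values from this task, not tuned recommendations.

```xml
<configuration>
  <!-- Hard ceiling: hive1 cannot grow beyond 50 percent even when the cluster is idle -->
  <property>
    <name>yarn.scheduler.capacity.root.hive1.maximum-capacity</name>
    <value>50</value>
  </property>
  <!-- A single user can use at most 1x the queue's configured capacity -->
  <property>
    <name>yarn.scheduler.capacity.root.hive1.user-limit-factor</name>
    <value>1</value>
  </property>
</configuration>
```

When the file is edited by hand instead of through Ambari, queue changes are typically applied by running yarn rmadmin -refreshQueues on the cluster.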
- From the Ambari dashboard, select .
- Click the URL for the view named AUTO_CS_INSTANCE, which is the capacity scheduler view.
- In the YARN Queue Manager, click Add Queue.
- Enter the queue path, which is the name of the first queue, hive1, and then add the hive2 queue.
To create the following schedule, select the root queue and add hive2 at that level: