2. Tuning for a Mix of Interactive and Batch Hive Queries

In general, adjustments for interactive queries will not adversely affect batch queries, so both types of queries can usually run well together on the same cluster. You can use Capacity Scheduler queues to divide cluster resources between batch and interactive queries. For example, you might allocate 50% of the cluster capacity to a “default” queue for batch jobs and create two queues for interactive Hive queries, each assigned 25% of cluster resources:

yarn.scheduler.capacity.root.queues=default,hive1,hive2
yarn.scheduler.capacity.root.default.capacity=50
yarn.scheduler.capacity.root.hive1.capacity=25
yarn.scheduler.capacity.root.hive2.capacity=25
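
The capacities of the queues listed under the same parent are expected to add up to 100. As a quick sanity check before applying a configuration like the one above, the following minimal Python sketch parses the property lines and totals the capacities of the root's child queues. The helper names (`parse_properties`, `child_capacities`) are hypothetical and exist only for this illustration; they are not part of Hadoop or Hive.

```python
# Minimal sketch: check that sibling queue capacities under "root" sum to 100.
# The helper names below are hypothetical, used only for this illustration.

EXAMPLE = """
yarn.scheduler.capacity.root.queues=default,hive1,hive2
yarn.scheduler.capacity.root.default.capacity=50
yarn.scheduler.capacity.root.hive1.capacity=25
yarn.scheduler.capacity.root.hive2.capacity=25
"""


def parse_properties(text):
    """Parse key=value lines into a dict, skipping blank and comment lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def child_capacities(props, parent="root"):
    """Return {queue name: capacity percent} for the children of `parent`."""
    prefix = "yarn.scheduler.capacity." + parent
    names = [name.strip() for name in props[prefix + ".queues"].split(",")]
    return {name: float(props[prefix + "." + name + ".capacity"]) for name in names}


capacities = child_capacities(parse_properties(EXAMPLE))
total = sum(capacities.values())

if total != 100.0:
    raise ValueError("queue capacities sum to %s, expected 100" % total)
print(capacities, total)
```

The check only covers one level of the queue hierarchy; a nested layout would need the same test repeated for each parent queue.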

The following settings allow the batch queue to expand to 100% of the cluster when no one else is using it (at night, for example). The maximum-capacity of the default (batch) queue is set to 100%, and the user-limit-factor is raised from its default of 1 to 2 so that a single user can occupy up to twice the queue’s configured capacity of 50%. Without the higher user-limit-factor, one user could not grow beyond the configured 50% even though the queue itself is allowed to reach 100%.

yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.user-limit-factor=2
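
The arithmetic behind these two settings can be sketched as follows. This is a simplified model, not the scheduler's actual algorithm: it assumes a single active user in the queue and ignores other limits such as minimum-user-limit-percent. The function name `user_ceiling_percent` is hypothetical.

```python
# Simplified model (an assumption, not the scheduler's real algorithm):
# a single user's ceiling is the queue capacity times user-limit-factor,
# capped by the queue's maximum-capacity. All values are percentages
# of total cluster resources.

def user_ceiling_percent(capacity, user_limit_factor, maximum_capacity):
    """Upper bound on the cluster share one user can hold in a queue."""
    return min(capacity * user_limit_factor, maximum_capacity)


# Batch queue as configured above: capacity 50, factor 2, maximum 100.
batch_ceiling = user_ceiling_percent(50, 2, 100)

# Same queue with the default user-limit-factor of 1.
default_factor_ceiling = user_ceiling_percent(50, 1, 100)

print(batch_ceiling, default_factor_ceiling)
```

Under these assumptions, a single batch user can reach the whole cluster only when both settings are raised together: maximum-capacity lifts the queue's cap, and user-limit-factor lifts the per-user cap.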