Apache Hive Performance Tuning

Chapter 2. Hive LLAP on Your Cluster

After setup, Hive LLAP is transparent to Apache Hive users and business intelligence (BI) tools. Interactive queries run on Apache Hadoop YARN as an Apache Slider application. You can monitor query performance in real time through the YARN ResourceManager Web UI or with the Slider and YARN command-line tools. Running through Slider lets you easily open your cluster, share resources with other applications, remove your cluster, and use your resources flexibly. For example, you could run a large Hive LLAP cluster during the day for BI tools, and then reduce its usage during nonbusiness hours to free cluster resources for ETL processing.

Figure 2.1. LLAP on Your Cluster


On your cluster, an extra HiveServer instance dedicated to interactive queries is installed. This HiveServer instance is listed on the Hive Summary page of Ambari.

In the YARN ResourceManager Web UI, you can see the queue of Hive LLAP daemons or running queries:

Figure 2.2. ResourceManager Web UI
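
The same information shown in the ResourceManager Web UI can also be read programmatically. The following minimal Python sketch assumes the ResourceManager REST endpoint `/ws/v1/cluster/apps` is reachable over plain HTTP without authentication; the helper names and the host name in the usage note are hypothetical and are not part of Hive, Slider, or Ambari.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def running_apps_url(rm_base_url):
    """Build the ResourceManager REST URL that lists RUNNING applications."""
    query = urlencode({"states": "RUNNING"})
    return rm_base_url.rstrip("/") + "/ws/v1/cluster/apps?" + query


def summarize_apps(payload):
    """Return (name, applicationType, queue) for each app in an RM response.

    The ResourceManager returns {"apps": null} when nothing matches,
    so both levels of the structure are guarded.
    """
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a.get("name"), a.get("applicationType"), a.get("queue")) for a in apps]


def fetch_running_apps(rm_base_url, timeout=10):
    """Query the ResourceManager and summarize its running applications.

    Assumption: the cluster is unsecured (no Kerberos/SPNEGO), so a plain
    HTTP GET is enough.
    """
    with urlopen(running_apps_url(rm_base_url), timeout=timeout) as resp:
        return summarize_apps(json.load(resp))
```

Calling `fetch_running_apps("http://rm.example.com:8088")`, with the placeholder host replaced by your own ResourceManager address, should return one tuple per running application. On a typical installation the LLAP daemons are expected to appear as a Slider application and each running query coordinator as a Tez application, but the exact names, types, and queue depend on your configuration.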


The number of Apache Tez ApplicationMasters equals the selected concurrency. If you selected a total concurrency of 5, you see 5 Tez ApplicationMasters. The following example shows selecting a concurrency of 2:

Figure 2.3. Concurrency Setting
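
The relationship between concurrency and Tez ApplicationMasters can be checked with a short script. The following Python sketch is illustrative only: it assumes you already have a list of application records shaped like the entries returned by the ResourceManager REST API (dictionaries with `applicationType`, `state`, and `queue` keys), that Tez applications report the type `TEZ`, and that the interactive queue is named `llap`. The function names and the queue name are hypothetical.

```python
def count_tez_ams(apps, queue):
    """Count RUNNING Tez applications (one ApplicationMaster each) in a queue."""
    return sum(
        1
        for app in apps
        if app.get("applicationType") == "TEZ"
        and app.get("state") == "RUNNING"
        and app.get("queue") == queue
    )


def matches_concurrency(apps, queue, concurrency):
    """Return True when the Tez ApplicationMaster count equals the concurrency."""
    return count_tez_ams(apps, queue) == concurrency


# Hypothetical snapshot for a cluster configured with a concurrency of 2.
# The queue name "llap" and the Slider application type are assumptions.
snapshot = [
    {"applicationType": "TEZ", "state": "RUNNING", "queue": "llap"},
    {"applicationType": "TEZ", "state": "RUNNING", "queue": "llap"},
    {"applicationType": "org-apache-slider", "state": "RUNNING", "queue": "llap"},
    {"applicationType": "TEZ", "state": "RUNNING", "queue": "default"},
]
```

With this snapshot, `matches_concurrency(snapshot, "llap", 2)` holds, because only the two Tez applications in the `llap` queue are counted; the Slider application and the Tez application in another queue are ignored.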