Apache Hive Performance Tuning

Preparations for tuning performance

Before you tune Apache Hive, you should follow best practices. These guidelines include how you configure the cluster, store data, and write queries.

Best practices

  • Set up your cluster to use Apache Tez or the Hive on Tez execution engine.

    In HDP 3.0 and later, the MapReduce execution engine is replaced by Tez.

  • Disable user impersonation by setting Run as end user to false in Ambari or by setting hive.server2.enable.doAs = false in hive-site.xml.

    LLAP caches data across multiple queries, and this shared cache does not support user impersonation.

  • Add the Ranger security service to your cluster and dependent services.
  • Activate and configure LLAP if you want to run interactive queries.
  • Store data using the ORC (Optimized Row Columnar) file format.
  • Ensure that queries are fully vectorized by examining explain plans.
  • Use the SmartSense tool to detect common system misconfigurations.
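
Several of the practices above can be checked from a Hive session. The following HiveQL is a minimal sketch, not a definitive procedure: the table sales_orc and its columns are hypothetical names chosen for illustration, and the session-level SET statements assume that your cluster allows these properties to be changed at run time. The impersonation setting is not shown because it is a HiveServer2 setting in hive-site.xml (or Ambari), not a per-query setting.

```sql
-- Illustrative sketch only: the table and column names are hypothetical.

-- Use Tez as the execution engine (already the engine in HDP 3.0 and later).
SET hive.execution.engine=tez;

-- Turn on vectorized query execution for this session.
SET hive.vectorized.execution.enabled=true;

-- Store data in the ORC format.
CREATE TABLE sales_orc (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2),
  order_date  DATE
)
STORED AS ORC;

-- Examine the explain plan to see which parts of the query are vectorized.
EXPLAIN VECTORIZATION
SELECT customer_id, SUM(amount) AS total_amount
FROM sales_orc
WHERE order_date >= DATE '2018-01-01'
GROUP BY customer_id;
```

In the plan output, vertices that are vectorized are typically reported with an execution mode that includes the word vectorized. If a vertex is not vectorized, running EXPLAIN VECTORIZATION DETAIL on the same query generally gives more information about the operator or expression that prevented vectorization.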