Apache Hive Performance Tuning
Also available as:

Chapter 3. Best Practices Prior to Tuning Performance


Before tuning an Apache Hive system in depth, ensure that you adhere to the following best practices with your Hive deployment.

Use ORCFile format

Store all data in ORCFile format. See Maximizing Storage Resources for Hive.

Run the Hive EDW on Tez

Use Hive LLAP with Tez or the Hive on Tez execution engine rather than MapReduce.

Verify LLAP status for interactive queries

If you want to run interactive queries, ensure that the Hive LLAP engine is activated and configured. The best way to enable Hive LLAP or check if it is enabled is to use Ambari.

Check explain plans

Ensure queries are fully vectorized by examining their explain plans. See Optimizing the Hive Execution Engine for a conceptual explanation of different explain plans. Go directly to the Query Tab section of the Ambari Hive View 2.0 documentation to learn how to generate and interpret visual explain plans.

Run SmartSense

Use the SmartSense tool to detect common system misconfigurations. See the SmartSense documentation site for more information.