2. Technical Preview: Cost-based SQL Optimization

Note

This feature is a technical preview and is considered to be under development. Do not use this feature in your production systems. If you have questions regarding this feature, contact Support by logging a case on the Hortonworks Support Portal at http://support.hortonworks.com.

Hive 0.13.0 introduces cost-based optimization of SQL queries. Cost-based optimization may reduce query latency by reducing the cardinality of intermediate result sets. The cost-based optimizer (CBO) also improves usability because good query execution plans no longer depend on query hints. The CBO is useful in the following scenarios:

  • Ad hoc queries containing multiple views

  • View chaining

  • Business Intelligence tools that act as a front end to Hive

Enabling Cost-based Optimization

To enable cost-based optimization of SQL queries, set the following configuration parameters in hive-site.xml:

Table 2.1. CBO Configuration Parameters

Configuration Parameter | Description | Default Value
------------------------|-------------|--------------
hive.cbo.enable | Indicates whether to use cost-based query optimization. | false
hive.cbo.max.joins.supported | Specifies the maximum number of joins that may be included in a query that is submitted for cost-based optimization. | 10
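
As a minimal sketch, the parameters in Table 2.1 might be set in hive-site.xml as shown below. The property names and the join limit of 10 are taken from the table. Setting hive.cbo.enable to true is the assumed change that turns the feature on, and the enclosing configuration element stands in for the one already present in your file.

```xml
<configuration>
  <!-- Assumption: merge these properties into the existing
       <configuration> element of hive-site.xml rather than
       replacing the file. -->

  <!-- Turn on cost-based optimization (the default is false). -->
  <property>
    <name>hive.cbo.enable</name>
    <value>true</value>
  </property>

  <!-- Maximum number of joins in a query submitted for cost-based
       optimization; shown here at its documented default of 10. -->
  <property>
    <name>hive.cbo.max.joins.supported</name>
    <value>10</value>
  </property>
</configuration>
```

Only hive.cbo.enable needs to differ from its default to enable the feature. The join limit is listed explicitly so that it is easy to locate and adjust later.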


