Hortonworks Data Platform deploys Apache Hive for your Hadoop cluster.
Hive is a data warehouse infrastructure built on top of Hadoop. It provides tools for extracting, transforming, and loading (ETL) data, a mechanism for imposing structure on that data, and the ability to query and analyze large data sets stored in Hadoop files.
Hive provides SQL on Hadoop, enabling users familiar with SQL to query the data. At the same time, Hive's SQL allows programmers who are familiar with the MapReduce framework to plug in their custom mappers and reducers to perform more sophisticated analysis that may not be supported by the built-in capabilities of the language.
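One common way to plug in a custom mapper is a streaming script that Hive's TRANSFORM clause feeds rows to on standard input, as tab-separated text by default. The following is a minimal sketch of such a script in Python. The file name word_mapper.py, the --stream flag, and the word-count logic are illustrative assumptions, not part of HDP.

```python
import sys


def map_line(line):
    """Turn one tab-separated input row into 'word<TAB>1' output rows."""
    # Each row arrives as a single line of tab-separated columns.
    fields = line.rstrip("\n").split("\t")
    rows = []
    for field in fields:
        for word in field.split():
            rows.append(word.lower() + "\t1")
    return rows


def main(stdin, stdout):
    """Read rows from stdin and write the mapped rows to stdout."""
    for line in stdin:
        for row in map_line(line):
            stdout.write(row + "\n")


# Hypothetical --stream flag: only read stdin when asked to, so that
# importing or testing this file never blocks waiting for input.
if __name__ == "__main__" and "--stream" in sys.argv[1:]:
    main(sys.stdin, sys.stdout)
```

Assuming a table named docs with a string column named line (both hypothetical), the script could be registered with ADD FILE word_mapper.py and then invoked with a query such as SELECT TRANSFORM (line) USING 'python word_mapper.py --stream' AS (word, cnt) FROM docs.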
Hive now includes the HCatalog subproject for managing metadata services on your Hadoop cluster. See Using HDP for Metadata Services (HCatalog) for more information.
Hive Documentation
Documentation for Hive release 0.11 can be found in the Hive wiki and javadocs.
The Hive wiki contains documentation organized in these sections:
General Information about Hive
User Documentation
Administrator Documentation
Resources for Contributors
Javadocs describe the Hive API.
Hive JIRAs
Hive bugs and improvements are tracked in the Hive JIRAs.
Hive ODBC Driver
Hortonworks provides a Hive ODBC driver that allows you to connect Hive with popular Business Intelligence (BI) tools to query, analyze and visualize data stored within the Hortonworks Data Platform.
Hive Metastore Scripts
Metastore database initialization and upgrade scripts for Hive 0.11 are exactly the same as those for Hive 0.10, because the schema did not change. Script names were not changed to match the new release number. For example, the script "hive-schema-0.10.0.mysql.sql" initializes a MySQL database for the Hive 0.11 metastore. The scripts can be found in C:\hadoop\hive-0.11.0\scripts\metastore\upgrade\
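
Because the script names keep the 0.10.0 schema number, automation that picks a script by Hive release needs an explicit mapping. The sketch below shows one way to express that in Python. The helper names and the SCHEMA_VERSION table are illustrative assumptions. The text above does not say whether the scripts sit directly in the upgrade directory or in a per-database subfolder, so check your installation before relying on the joined path.

```python
import ntpath  # Windows path rules, regardless of the host platform

# Schema version used by each Hive release. Per the text above,
# Hive 0.11 reuses the 0.10.0 scripts because the schema did not change.
SCHEMA_VERSION = {"0.10": "0.10.0", "0.11": "0.10.0"}

# Directory named in the text above.
UPGRADE_DIR = "C:\\hadoop\\hive-0.11.0\\scripts\\metastore\\upgrade\\"


def schema_script_name(hive_release, db_type="mysql"):
    """Return the initialization script name for a Hive release."""
    schema = SCHEMA_VERSION[hive_release]
    return "hive-schema-{0}.{1}.sql".format(schema, db_type)


def schema_script_path(hive_release, db_type="mysql", base_dir=UPGRADE_DIR):
    """Join the script name onto the base directory (assumed layout)."""
    return ntpath.join(base_dir, schema_script_name(hive_release, db_type))


if __name__ == "__main__":
    print(schema_script_name("0.11"))
    print(schema_script_path("0.11"))
```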