Data Access

Chapter 4. Accessing Apache Hive with HCatalog and WebHCat

HCatalog

Hortonworks Data Platform (HDP) deploys HCatalog as a way to access the metadata services of Apache Hive.

HCatalog was designed as a table and storage management service for data created by many different components of Apache Hadoop. After its release, however, HCatalog in HDP came to be used primarily as a way to access Hive services from Apache Pig and MapReduce.

This functionality includes:

  • Providing a shared schema and data type mechanism.

  • Providing a table abstraction so that users need not be concerned with where or how their data is stored.

Start the HCatalog CLI with the following command:

<hadoop-install-dir>\hcatalog-0.5.0\bin\hcat.cmd 
Note

HCatalog 0.5.0 was the final version released from the Apache Incubator. In March 2013, HCatalog graduated from the Apache Incubator and became part of the Apache Hive project. New releases of Hive include HCatalog, starting with Hive 0.11.0.

For more details about the Apache Hive project, including HCatalog and WebHCat, see the Using Apache Hive chapter and the following resources:

For information about running Apache Pig with HCatalog, see HCatalog LoadStore on the Apache Hive wiki.

To run MapReduce jobs on HCatalog-managed tables, see the HCatalog InputOutput page on the wiki.

WebHCat

WebHCat provides a REST API over HTTP so that clients, including web GUIs, can work with Hive databases, Pig, and MapReduce without a command-line environment.

WebHCat Community Information

Note

WebHCat was originally named Templeton, and the two names are still used interchangeably. For backward compatibility, the Templeton name still appears in URLs and log file names.

For details about WebHCat (Templeton), see the following resources:

Security for WebHCat

WebHCat currently supports two types of security:

  • Default security (without additional authentication)

  • Authentication by using Kerberos

Example: HTTP GET :table

The following example demonstrates how to specify the user.name parameter in an HTTP GET request:

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table?user.name=ctdean'
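
When the same request is issued for several tables or users, it can help to build the URL in one place. The following is a minimal sketch, not part of WebHCat itself: the function name webhcat_table_url is hypothetical, and the host, port, and path layout are simply copied from the example above. It assumes the database, table, and user names need no URL encoding.

```shell
#!/bin/sh
# Hypothetical helper (not provided by WebHCat): builds the DDL URL for a table
# and appends the user.name query parameter.
# Host, port, and path are taken from the example above; adjust for your cluster.
webhcat_table_url() {
    # $1 = database, $2 = table, $3 = user name
    printf 'http://localhost:50111/templeton/v1/ddl/database/%s/table/%s?user.name=%s\n' \
        "$1" "$2" "$3"
}

# Print the URL for the table used in the example above.
webhcat_table_url default my_table ctdean
```

To issue the request against a running WebHCat server, pass the result to curl, for example: curl -s "$(webhcat_table_url default my_table ctdean)"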

Example: HTTP POST :table

The following example demonstrates how to specify the user.name parameter in an HTTP POST request:

% curl -s -d user.name=ctdean \
       -d rename=test_table_2 \
       'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table'

Security Error

If the user.name parameter is not supplied when required, the following security error is returned:

{ 
   "error": "No user found. Missing user.name parameter."
}
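
A script that calls WebHCat may want to detect this error before it processes the response. The following is a minimal sketch under stated assumptions: the function name webhcat_has_error is hypothetical, and it uses a simple grep match for an "error" field rather than a full JSON parser, so it could misfire on a response that contains that text inside another value.

```shell
#!/bin/sh
# Hypothetical helper (not provided by WebHCat): returns exit status 0 when the
# response body contains an "error" field, and non-zero otherwise.
# This is a plain text match, not a real JSON parse.
webhcat_has_error() {
    printf '%s' "$1" | grep -q '"error"[[:space:]]*:'
}

# Sample response body, copied from the security error shown above.
response='{ "error": "No user found. Missing user.name parameter." }'

if webhcat_has_error "$response"; then
    # Report the failure on standard error so it does not mix with normal output.
    echo "WebHCat request failed: $response" >&2
fi
```

In practice the response variable would hold the output of a curl -s call such as the ones shown earlier in this section.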