4. Apache HCatalog

HCatalog is a metadata abstraction layer for referencing data without using the underlying filenames or formats. It insulates users and scripts from how and where the data is physically stored.

Templeton provides a REST-like web API for HCatalog and related Hadoop components. Application developers make HTTP requests to access Hadoop MapReduce, Pig, Hive, and HCatalog DDL from within their applications. Data and code used by Templeton are maintained in HDFS. HCatalog DDL commands are executed directly when requested. MapReduce, Pig, and Hive jobs are queued by Templeton and can be monitored for progress or stopped as required. Developers also specify a location in HDFS into which Templeton should place Pig, Hive, and MapReduce results.
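
The workflow above maps onto a handful of HTTP calls. The Python sketch below (using the `requests` library) is a minimal illustration, assuming Templeton's default port (50111), its v1 resource paths (`/ddl`, `/pig`, `/queue/:jobid`), and placeholder host, user, script, and output locations; it runs a DDL command, which Templeton executes immediately, then queues a Pig job whose results are written to an HDFS `statusdir`, and finally polls the queue for the job's state.

```python
import requests

# Assumed Templeton endpoint and user name; adjust for your cluster.
BASE = "http://templeton-host:50111/templeton/v1"
USER = "hcatuser"

# HCatalog DDL is executed directly; the output comes back in the response.
ddl = requests.post(
    f"{BASE}/ddl",
    data={"user.name": USER, "exec": "show tables;"},
)
print(ddl.json())

# Queue a Pig job. The script lives in HDFS, and 'statusdir' is the HDFS
# directory where Templeton places the job's results.
job = requests.post(
    f"{BASE}/pig",
    data={
        "user.name": USER,
        "file": "/user/hcatuser/scripts/transform.pig",    # Pig script in HDFS (placeholder path)
        "statusdir": "/user/hcatuser/output/transform",     # results location in HDFS (placeholder path)
    },
).json()
job_id = job["id"]

# Monitor the queued job for progress by polling its queue resource.
status = requests.get(f"{BASE}/queue/{job_id}", params={"user.name": USER})
print(status.json()["status"]["runState"])
```

Hive and MapReduce jobs follow the same pattern through their own resources, and issuing a `DELETE` against the same queue URL stops a running job.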