Chapter 2. Using HDP for Metadata Services (HCatalog)

Hortonworks Data Platform deploys Apache HCatalog to manage the metadata services for your Hadoop cluster.

Apache HCatalog is a table and storage management service for data created using Apache Hadoop. This includes:

  • Providing a shared schema and data type mechanism.

  • Providing a table abstraction so that users need not be concerned with where or how their data is stored.

  • Providing interoperability across data processing tools such as Pig, MapReduce, and Hive.

Start the HCatalog CLI with the command '<hadoop-install-dir>\hcatalog-0.5.0\bin\hcat.cmd'.
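
As a minimal sketch of non-interactive use, the HCatalog CLI can be given a single DDL statement through its -e option. The wrapper below is written for a POSIX shell and assumes a launcher named hcat is on the PATH; on a Windows installation the hcat.cmd path shown above would take its place. The HCAT variable and the hcat_exec function name are hypothetical and exist only for this illustration.

```shell
# Launcher to call; override HCAT to point at your own install (assumption).
HCAT="${HCAT:-hcat}"

# Run one HCatalog DDL statement non-interactively via the -e option.
hcat_exec() {
    "$HCAT" -e "$1"
}
```

For example, hcat_exec "show tables;" would pass that statement to the CLI, provided the metastore is reachable.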

HCatalog Documentation

For details about HCatalog, see the Apache HCatalog documentation, which includes the following resources:

Using WebHCat (Templeton)

WebHCat provides a REST-like web API for HCatalog and related Hadoop components.

Note

WebHCat was originally named Templeton, and both terms may still be used interchangeably.
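
To illustrate the REST-like API, the sketch below builds WebHCat resource URLs and queries the server status resource with curl. It assumes a WebHCat server on localhost at port 50111 (the commonly used default) and the versioned templeton/v1 path prefix, so adjust both for your cluster. The WEBHCAT_HOST and WEBHCAT_PORT variables and both function names are hypothetical.

```shell
# Build the URL of a WebHCat resource.
# Host and port defaults are assumptions; override them for your cluster.
webhcat_url() {
    host="${WEBHCAT_HOST:-localhost}"
    port="${WEBHCAT_PORT:-50111}"
    printf 'http://%s:%s/templeton/v1/%s\n' "$host" "$port" "$1"
}

# Query the status resource (needs curl and a running WebHCat server).
webhcat_status() {
    curl -s "$(webhcat_url status)"
}
```

Calling webhcat_status against a running server should return a short JSON document describing the server state. The templeton segment in the path reflects the original project name.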

For details about WebHCat, see the following resources:

Corrections to Installation from Tarball (see HCATALOG-625)

  • Replace the section "Building a tarball" with this:

    If you downloaded HCatalog from Apache or another site as a source release, you must first build a tarball before you can install it. You can identify a source release by the name of the file you downloaded: if it is named hcatalog-src-0.5.0-incubating.tar.gz (note the src in the name), you have a source release.

    If you do not already have Apache Ant installed on your machine, you will need to obtain it. You can get it from the Apache Ant website. Once you download it, you will need to unpack it somewhere on your machine. The directory where you unpack it will be referred to as ant_home in this document.

    To produce a binary tarball from the downloaded source tarball, execute the following steps:

    tar xzf hcatalog-src-0.5.0-incubating.tar.gz
    cd hcatalog-src-0.5.0-incubating
    ant_home/bin/ant package
    

    The tarball for installation should now be at build/hcatalog-0.5.0-incubating.tar.gz.

  • In the "Thrift Server Setup" section:

    • In the third paragraph, replace Hive 0.9 with the current version of Hive.

    • Replace these commands:

      tar zxf hcatalog-0.5.0.tar.gz
      cd hcatalog-0.5.0
      

      with these:

      tar zxf hcatalog-0.5.0-incubating.tar.gz
      cd hcatalog-0.5.0-incubating
      
    • In the next paragraph, add this sentence:

      If there is no hive-site.xml file in the hive conf directory, copy hcat_home/etc/hcatalog/proto-hive-site.xml to hive_home/conf/ and rename it hive-site.xml.

  • In the "Client Installation" section, replace this command:

    tar zxf hcatalog-0.5.0.tar.gz
    

    with this:

    tar zxf hcatalog-0.5.0-incubating.tar.gz
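
Two of the corrections above lend themselves to a short script: recognizing a source release by the src in its file name, and copying proto-hive-site.xml only when hive-site.xml is missing. The following is a sketch under those assumptions, not part of the HCatalog distribution. The function names are hypothetical, and the two arguments of ensure_hive_site stand for the hcat_home and hive_home directories used in this document.

```shell
# Succeed if the file name looks like a source release (contains "-src-").
is_source_release() {
    case "$(basename "$1")" in
        *-src-*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Copy the prototype hive-site.xml into the Hive conf directory,
# but only if no hive-site.xml exists there yet.
ensure_hive_site() {
    hcat_home="$1"
    hive_home="$2"
    if [ ! -f "$hive_home/conf/hive-site.xml" ]; then
        cp "$hcat_home/etc/hcatalog/proto-hive-site.xml" \
           "$hive_home/conf/hive-site.xml"
    fi
}
```

Because ensure_hive_site checks for an existing file first, running it a second time leaves a customized hive-site.xml untouched.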
    

Additional Information

For more details on the Apache HCatalog project, use the following resources: