Chapter 1. About Hortonworks Data Platform

Hortonworks Data Platform (HDP) is an open source distribution powered by Apache Hadoop. HDP provides the actual Apache-released versions of the components, together with the bug fixes needed to make them interoperable in production environments. It is packaged with an easy-to-use installer (HDP Installer) that deploys the complete Apache Hadoop stack to your entire cluster. The HDP distribution consists of the following components:

  1. Core Hadoop platform (Hadoop HDFS and Hadoop MapReduce)

  2. Non-relational database (Apache HBase)

  3. Metadata services (Apache HCatalog)

  4. Scripting platform (Apache Pig)

  5. Data access and query (Apache Hive)

  6. Workflow scheduler (Apache Oozie)

  7. Cluster coordination (Apache ZooKeeper)

  8. Data integration services (HCatalog APIs, WebHCatalog, WebHDFS)

To learn more about the distribution details and the component versions, see the Release Notes. All components are official Apache releases of the most recent stable versions available. Hortonworks’ philosophy is to apply patches only when absolutely necessary to ensure interoperability of the components. Consequently, there are very few patches in HDP, and they are all fully documented. Each of the HDP components has been tested rigorously prior to the actual Apache release. To learn more about the testing strategy adopted at Hortonworks, Inc., see: Delivering high-quality Apache Hadoop releases.