5. What's New in this Release

HDP 2.2 includes the following new features:

  • Accumulo

    • Accumulo multi-datacenter replication

    • Accumulo on YARN via Slider

  • Falcon

    • Authorization

    • Lineage Enhancement

    • HCat Replication / Retention

    • Archive to Cloud

  • Flume

    • Flume streaming to Hive for secure and unsecure clusters

  • HBase

    • HBase HA: Timeline-consistent replicas with real-time replication

    • HBase block cache compression

    • HBase on YARN via Slider

  • HDFS

    • Heterogeneous Storage: Support for SSD Tier

    • Heterogeneous Storage: Support for Archival Tier

    • Operating secure DataNode without requiring root access

    • AES support for faster wire encryption

  • Hive

    • Support for SQL Transactions with ACID semantics

    • Support for SQL Temporary Tables

    • Better query optimization using the Cost-Based Optimizer

    • Performance improvements for various queries

    • Security additions such as GRANT and REVOKE, with a choice of Native Security or Ranger security, both fully integrated in Hive

  • Knox

    • Support for HDFS HA

    • Installation and configuration with Apache Ambari

    • Service-level authorization with Apache Ranger

    • YARN REST API access

  • Oozie

    • Oozie HA on secure clusters

  • Phoenix

    • Subquery support

    • Robust secondary indexes

    • Build secondary index during bulk import

  • Pig

    • Pig on Tez

    • DataFu included for use with Pig

  • Ranger

    • Storm Authorization and Auditing

    • Knox Authorization and Auditing

    • Deeper integration with HDP stack: Hive Auth API support for grant/revoke commands, grant/revoke commands in HBase, and Windows support

    • REST APIs for policy manager, local audit log storage in HDFS, and support for Oracle DB for policy store and audits

  • Sqoop

    • Support for all Hive types via HCatalog import/export

    • Support for multiple partition keys

    • Integration with Hadoop Credential Management Framework

  • Storm

    • Storm clusters can now be provisioned via Ambari

    • Storm can now run on YARN-based clusters using Apache Slider

    • Storm now supports Kerberos-based authentication and pluggable authorization

    • Pre-built Spouts for JMS, HBase lookups and Bolts for Kafka, Hive

    • Real-time visualization of running topologies and their associated metrics

    • REST APIs for topology stats

  • Tez

    • Tez Debug Tooling & UI

  • YARN

    • Support for long-running services: log handling, containers not killed when the AM dies, secure token renewal, and YARN labels for tagging nodes for specific workloads

    • Support for CPU Scheduling and CPU Resource Isolation through CGroups

    • Work-preserving restarts of ResourceManager and NodeManager

    • Support for node labels during scheduling

    • Global, shared cache for application artifacts

    • REST API for YARN application submission and termination

    • Application Timeline Server is supported in a secure (Kerberized) cluster