Integrating Apache Hive with Spark and BI

Submit a Hive Warehouse Connector Python app

You can submit a Python app based on the HiveWarehouseConnector library by following the steps for submitting a Scala or Java application and then adding the Python package for the connector.

  1. Locate the hive-warehouse-connector-assembly jar in /usr/hdp/current/hive_warehouse_connector/.
  2. Add the connector jar to the app submission using --jars.
    spark-shell --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar
  3. Locate the pyspark_hwc zip package in /usr/hdp/current/hive_warehouse_connector/.
  4. Add the Python package for the connector to the app submission using --py-files, alongside the connector jar.
    pyspark --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar --py-files /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-<version>.zip
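The steps above can also be scripted so that the versioned file names are resolved automatically rather than typed by hand. The following is a minimal sketch, not part of the connector: the function name build_submit_command and the application name my_app.py are hypothetical, and it assumes the app is launched non-interactively with spark-submit (which accepts the same --jars and --py-files options as pyspark) and that the last match in sorted order is the file you want when several versions are present.

```python
import glob
import os


def build_submit_command(hwc_dir, app_path):
    """Return a spark-submit argument list that ships the HWC jar and Python zip.

    hwc_dir is the directory named in steps 1 and 3; app_path is your own
    Python application (a placeholder in this sketch).
    """
    # Steps 1 and 3: locate the versioned assembly jar and pyspark_hwc zip.
    jars = sorted(glob.glob(os.path.join(hwc_dir, "hive-warehouse-connector-assembly-*.jar")))
    zips = sorted(glob.glob(os.path.join(hwc_dir, "pyspark_hwc-*.zip")))
    if not jars or not zips:
        raise FileNotFoundError(
            "connector jar or pyspark_hwc zip not found in " + hwc_dir
        )
    # Steps 2 and 4: pass the jar with --jars and the zip with --py-files.
    # Assumption: if several versions exist, use the last one in sorted order.
    return ["spark-submit", "--jars", jars[-1], "--py-files", zips[-1], app_path]
```

On a cluster node, calling build_submit_command("/usr/hdp/current/hive_warehouse_connector", "my_app.py") would return an argument list that can be passed to subprocess.run or joined into a single command line. Returning a list rather than a string avoids shell-quoting problems if a path contains spaces.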