HiveWarehouseConnector for accessing Apache Spark data
HiveWarehouseConnector is a library for reading and writing Apache Spark DataFrames and Streaming DataFrames to and from Apache Hive using low-latency analytical processing (LLAP).
Together, Apache Ranger and the HiveWarehouseConnector library provide fine-grained, row-level and column-level access control over Spark data in Hive.
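You can use HiveWarehouseConnector from the following applications:
- Spark shell
- The spark-submit script
Operations you can perform with the library include the following:
- Describing a table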
- Creating a table for ORC-formatted data
- Selecting Hive data and retrieving a DataFrame
- Writing a DataFrame to Hive in batch
- Executing a Hive update statement
- Reading table data from Hive, transforming it in Spark, and writing it to a new Hive table
- Writing a DataFrame or Spark stream to Hive using HiveStreaming
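Several of the operations listed above can be sketched in PySpark as follows. This is a minimal illustration, not a definitive implementation. It assumes the `pyspark_llap` Python module that ships with the connector and the session methods `createTable`, `executeQuery`, and `describeTable` as documented for HDP 3.x. The table names, column names, helper function names, and the literal data source string are hypothetical choices made for this example, so check them against your connector version.

```python
# Data source name assumed to sit behind the connector's
# HIVE_WAREHOUSE_CONNECTOR constant; prefer the constant itself if available.
HWC_FORMAT = "com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector"


def build_session(spark):
    """Build a HiveWarehouseSession from an existing SparkSession."""
    # Imported lazily because pyspark_llap ships with the connector,
    # not with PySpark itself.
    from pyspark_llap import HiveWarehouseSession
    return HiveWarehouseSession.session(spark).build()


def create_sales_table(hive, table):
    """Create a table with two hypothetical columns if it does not exist."""
    (hive.createTable(table)
         .ifNotExists()
         .column("id", "bigint")
         .column("price", "double")
         .create())


def read_transform_write(hive, source, target, min_price):
    """Read rows from Hive, filter them in Spark, write them to a new table."""
    # Select Hive data and retrieve it as a DataFrame.
    df = hive.executeQuery(f"SELECT id, price FROM {source}")
    # Transform in Spark using a SQL filter expression.
    filtered = df.filter(f"price >= {min_price}")
    # Write the result to Hive in batch through the connector.
    filtered.write.format(HWC_FORMAT).option("table", target).save()
    # Describe the new table so the caller can inspect its schema.
    return hive.describeTable(target)
```

In a PySpark session started with the connector available, `hive = build_session(spark)` followed by `read_transform_write(hive, "sales", "big_sales", 100)` would run the read, transform, and write steps in order. The helpers take the session as a parameter so that the same code works from the Spark shell or from a job launched with spark-submit.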