Apache Spark access to Apache Hive
From Apache Spark, you access ACID v2 tables and external tables in Apache Hive 3 using the Hive Warehouse Connector.
The Hive Warehouse Connector is a Spark library, built on top of Apache Arrow, that lets Spark read from and write to Hive ACID and external tables.
You must deploy LLAP for secure reading of ACID tables. The Hive Warehouse Connector is optimized for fast transmission of data from LLAP to Spark and designed to leverage the LLAP cache. The connector orchestrates a distributed read from multiple LLAP daemons. The read from cache occurs after applying security rules and ACID transformations.
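As a rough illustration of the read path described above, the following PySpark sketch runs a query through the connector. It assumes that the connector JAR and its `pyspark_llap` Python package are available to the Spark application, that the application was launched with the HiveServer2 Interactive JDBC URL and related connector properties already configured, and that `sales.web_orders` is a hypothetical ACID table. Module and method names can differ between connector releases, so treat this as a sketch rather than a reference.

```python
def read_through_connector(hive, sql):
    """Run `sql` through the Hive Warehouse Connector and return the result.

    `hive` is expected to be a HiveWarehouseSession; its executeQuery()
    method returns a Spark DataFrame.
    """
    return hive.executeQuery(sql)


def example_read():
    # Imported here so that the helper above can be loaded without a cluster.
    from pyspark.sql import SparkSession
    from pyspark_llap import HiveWarehouseSession

    spark = SparkSession.builder.appName("hwc-read-sketch").getOrCreate()
    hive = HiveWarehouseSession.session(spark).build()

    # Hypothetical table name, used only for illustration.
    df = read_through_connector(hive, "SELECT * FROM sales.web_orders")
    df.show(10)
```

Calling `example_read()` from an application submitted to a cluster with LLAP deployed would display the first rows of the table.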
To write to ACID tables with this library, you do not need to deploy LLAP. The library internally uses the Hive Streaming API and LOAD DATA Hive commands to write the data.
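A write can be sketched in the same way. The example below assumes an existing Spark DataFrame and a hypothetical target table named `sales.web_orders_copy`. The data source name in `HWC_FORMAT` and the `table` option follow the connector's commonly documented usage, but both are assumptions here and should be checked against the connector version you deploy, as should support for the save mode.

```python
# Assumed data source name of the connector; verify for your release.
HWC_FORMAT = "com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector"


def write_through_connector(df, table, mode="append"):
    """Write the DataFrame `df` to the Hive table `table` via the connector.

    Uses the standard Spark DataFrameWriter chain; the connector decides
    internally how the rows are delivered to Hive.
    """
    (df.write
       .format(HWC_FORMAT)
       .mode(mode)
       .option("table", table)
       .save())
```

For example, `write_through_connector(df, "sales.web_orders_copy")` would append the rows of `df` to the hypothetical table.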