Using Apache Storm to Move Data

Writing Data to Hive

Core Storm and Trident APIs support streaming data directly to Apache Hive using Hive transactions. Data committed in a transaction is immediately available to Hive queries from other Hive clients. You can stream data to existing table partitions, or configure the streaming Hive bolt to dynamically create desired table partitions.

Follow these steps to stream data to Hive:

  1. Instantiate an implementation of the HiveMapper interface.
  2. Instantiate a HiveOptions class with the HiveMapper implementation.
  3. Instantiate a HiveBolt with the HiveOptions class.
Note

Currently, data may be streamed only into bucketed tables that use the ORC file format.
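
The three steps above can be sketched as follows. This is a minimal example, not a complete topology: the metastore URI, database, table, and field names (`id`, `name`, `city`) are placeholder assumptions you would replace with your own values, and the table is assumed to already exist as a bucketed ORC table.

```java
import org.apache.storm.hive.bolt.HiveBolt;
import org.apache.storm.hive.bolt.mapper.DelimitedRecordHiveMapper;
import org.apache.storm.hive.common.HiveOptions;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class HiveBoltExample {
    public static void main(String[] args) {
        // Step 1: map tuple fields to Hive table columns and partitions.
        // "id" and "name" are example column names; "city" is an example
        // partition column -- adjust these to match your table definition.
        DelimitedRecordHiveMapper mapper = new DelimitedRecordHiveMapper()
                .withColumnFields(new Fields("id", "name"))
                .withPartitionFields(new Fields("city"));

        // Step 2: configure connection and batching behavior.
        // The metastore URI, database, and table names below are placeholders.
        HiveOptions hiveOptions =
                new HiveOptions("thrift://metastore-host:9083", "mydb", "mytable", mapper)
                        .withTxnsPerBatch(10)   // Hive transactions per batch
                        .withBatchSize(1000)    // tuples per transaction
                        .withIdleTimeout(10);   // seconds before idle writers close

        // Step 3: create the bolt and wire it into a topology.
        HiveBolt hiveBolt = new HiveBolt(hiveOptions);
        TopologyBuilder builder = new TopologyBuilder();
        // builder.setSpout("source", new MySpout());  // hypothetical upstream spout
        builder.setBolt("hive-sink", hiveBolt, 1);
    }
}
```

Because `withPartitionFields` is set, the bolt routes each tuple to the partition named by its `city` field, creating the partition if it does not already exist; omit that call to write only to existing partitions.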