Using Apache Storm to Move Data

Moving Data Into and Out of Apache Storm Using Spouts and Bolts

This chapter describes how to move data into and out of Apache Storm using spouts and bolts. Spouts ingest data into a topology by reading from external sources; bolts consume input streams and process the data, emit new streams, or write results to persistent storage. The bolts covered in this chapter move data from Storm out to external systems.
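To illustrate the spout/bolt roles described above, the following is a minimal sketch of a custom bolt and the topology wiring that connects it to a spout. The class names (`UpperCaseBolt`, `MyWordSpout`) and the field and component IDs are hypothetical, not part of HDP; the `org.apache.storm` package names assume Storm 1.x (earlier Storm releases used `backtype.storm`).

```java
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// A bolt that consumes an input stream, transforms each tuple,
// and emits a new stream (one of the bolt roles described above).
public class UpperCaseBolt extends BaseBasicBolt {

    // Called once per incoming tuple.
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = input.getStringByField("word");
        collector.emit(new Values(word.toUpperCase()));
    }

    // Declares the schema of the stream this bolt emits.
    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("upper-word"));
    }

    public static void main(String[] args) {
        // Wire a spout to the bolt; MyWordSpout is a placeholder for any
        // spout that emits a "word" field (for example, a Kafka spout).
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("word-spout", new MyWordSpout(), 1);
        builder.setBolt("upper-bolt", new UpperCaseBolt(), 2)
               .shuffleGrouping("word-spout");
        // builder.createTopology() would then be submitted to a cluster.
    }
}
```

In a real topology, a connector spout from the list below (such as the Kafka spout) would take the place of `MyWordSpout`, and a connector bolt (such as the HDFS bolt) would typically sit downstream to persist results.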

The following spouts are available in HDP 2.5:

  • Kafka spout based on Kafka 0.7.x/0.8.x, plus a new Kafka consumer spout available as a technical preview (not for production use)

  • HDFS

  • EventHubs

  • Kinesis (technical preview)

The following bolts are available in HDP 2.5:

  • Kafka

  • HDFS

  • EventHubs

  • HBase

  • Hive

  • JDBC (supports Phoenix)

  • Solr

  • Cassandra

  • MongoDB

  • ElasticSearch

  • Redis

  • OpenTSDB (technical preview)

Supported connectors are located at /usr/lib/storm/contrib. Each connector directory contains a .jar file with the connector's packaged classes and dependencies, and a second .jar file with javadoc reference documentation.

This chapter describes how to use the Kafka spout, HDFS spout, Kafka bolt, Storm-HDFS connector, and Storm-HBase connector APIs.