Apache Storm Component Guide

Chapter 5. Moving Data Into and Out of Apache Storm Using Spouts and Bolts

This chapter describes how to move data into and out of Apache Storm using spouts and bolts. Spouts read data from external sources and ingest it into a topology. Bolts consume input streams and then process the data, emit new streams, or send results to persistent storage. The bolts covered in this chapter are those that move data from Storm to external systems.
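The division of labor between spouts and bolts can be illustrated with a small, purely conceptual sketch. It does not use the Storm API: the class names `ListSpout`, `UppercaseBolt`, and `StoreBolt`, and the `run` helper, are hypothetical and exist only for this illustration. The in-memory lists stand in for an external source and for persistent storage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Conceptual sketch only: every type here is hypothetical and is NOT part of the Storm API.
class SpoutBoltSketch {

    /** Hypothetical spout: pulls records from an external source, one at a time. */
    static final class ListSpout {
        private final Iterator<String> source;

        ListSpout(List<String> externalSource) {
            this.source = externalSource.iterator();
        }

        /** Returns the next record, or null when the source is exhausted. */
        String nextTuple() {
            return source.hasNext() ? source.next() : null;
        }
    }

    /** Hypothetical processing bolt: consumes an input stream and emits a new one. */
    static final class UppercaseBolt {
        String execute(String tuple) {
            return tuple.toUpperCase(Locale.ROOT);
        }
    }

    /** Hypothetical sink bolt: sends results to storage (an in-memory list here). */
    static final class StoreBolt {
        final List<String> storage = new ArrayList<>();

        void execute(String tuple) {
            storage.add(tuple);
        }
    }

    /** Wires spout -> processing bolt -> sink bolt and drains the source. */
    static List<String> run(List<String> externalSource) {
        ListSpout spout = new ListSpout(externalSource);
        UppercaseBolt process = new UppercaseBolt();
        StoreBolt store = new StoreBolt();
        for (String t = spout.nextTuple(); t != null; t = spout.nextTuple()) {
            store.execute(process.execute(t));
        }
        return store.storage;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("click", "view"))); // prints [CLICK, VIEW]
    }
}
```

In a real topology, Storm itself schedules these components and routes tuples between them. The loop in `run` is only a stand-in for that wiring.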

The following spouts are available in HDP 2.5:

  • Kafka spout based on Kafka 0.7.x/0.8.x, plus a new Kafka consumer spout available as a technical preview (not for production use)

  • HDFS

  • EventHubs

  • Kinesis (technical preview)

The following bolts are available in HDP 2.5:

  • Kafka

  • HDFS

  • EventHubs

  • HBase

  • Hive

  • JDBC (supports Phoenix)

  • Solr

  • Cassandra

  • MongoDB

  • ElasticSearch

  • Redis

  • OpenTSDB (technical preview)

Supported connectors are located at /usr/lib/storm/contrib. Each connector includes one .jar file with the connector's packaged classes and dependencies, and a second .jar file with the Javadoc reference documentation.

This chapter describes how to use the Kafka spout, HDFS spout, Kafka bolt, Storm-HDFS connector, and Storm-HBase connector APIs. For information about connecting to components on a Kerberos-enabled cluster, see Configuring Connectors for a Secure Cluster.