Developing Apache Storm Applications

Stream Groupings

A stream grouping lets Storm developers control how the tuples of a stream are routed among the tasks of a receiving bolt in a topology. The following table describes commonly used stream groupings.

Table 1. Stream Groupings

| Stream Grouping | Description |
| --- | --- |
| Shuffle | Distributes tuples randomly across the tasks of the receiving bolt so that each task receives roughly the same number of tuples. Use for stateless, atomic operations, such as math. |
| Fields | Partitions the stream by the values of one or more fields in the tuple; tuples with the same values in those fields always go to the same task. Use to segment an incoming stream, for example to count tuples of a specified type. |
| All | Sends a copy of each tuple to every instance of the receiving bolt. Use to send a signal, such as clear cache or refresh state, to all bolts. |
| Custom | Routes tuples according to developer-supplied logic that implements the CustomStreamGrouping interface. Use for maximum flexibility in routing based on factors such as data types, load, and seasonality. |
| Direct | The source decides which instance of the receiving bolt receives each tuple. |
| Global | Sends tuples generated by all instances of a source to a single target instance. Use for global counting operations. |
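The routing decisions described in the table can be modeled in a few lines of plain Java. The following is a simplified sketch, not Storm code: the class name GroupingModel, its methods, and the use of task indexes 0..numTasks-1 are all assumptions made for illustration, and Storm's own grouping implementations differ in detail (for example, in how shuffle grouping balances load).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.Random;

/** Hypothetical model of how each grouping picks target task indexes. */
class GroupingModel {
    private final int numTasks;   // number of tasks of the receiving bolt
    private final Random random;

    GroupingModel(int numTasks, long seed) {
        this.numTasks = numTasks;
        this.random = new Random(seed);
    }

    /** Shuffle: any task may receive the tuple; chosen at random here. */
    List<Integer> shuffle() {
        return Collections.singletonList(random.nextInt(numTasks));
    }

    /** Fields: equal grouping values always map to the same task. */
    List<Integer> fields(Object... groupingValues) {
        int hash = Objects.hash(groupingValues);
        return Collections.singletonList(Math.floorMod(hash, numTasks));
    }

    /** All: every task receives its own copy of the tuple. */
    List<Integer> all() {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            targets.add(i);
        }
        return targets;
    }

    /** Global: the whole stream goes to one task (index 0 in this model). */
    List<Integer> global() {
        return Collections.singletonList(0);
    }
}
```

In this model, fields("storm") returns the same single index on every call, while all() returns every index. That is the property that makes fields grouping suitable for counting and all grouping suitable for broadcast signals.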

Storm developers specify the stream grouping for each bolt using methods on the TopologyBuilder.BoltGetter inner class, as shown in the following excerpt from the WordCountTopology.java example included with storm-starter.

...
TopologyBuilder builder = new TopologyBuilder(); 
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
...

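The first bolt, split, uses shuffle grouping to receive the random sentences generated by RandomSentenceSpout, so the sentences are spread across its instances (parallelism hint of 8) before being split into words. The second bolt, count, uses fields grouping on the word field, so every occurrence of a given word is sent to the same instance (parallelism hint of 12), which can then keep an accurate count for that word.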
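To see why the count bolt needs fields grouping, the flow can be imitated without Storm. The sketch below is a toy simulation under stated assumptions: WordCountSimulation and run are hypothetical names, the sample sentences are illustrative, and the hash-modulo routing only stands in for fields grouping rather than reproducing Storm's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy simulation of the split -> count flow; plain Java, no Storm classes. */
class WordCountSimulation {

    /** Returns one word-count map per simulated "count" task. */
    static List<Map<String, Integer>> run(List<String> sentences, int countTasks) {
        List<Map<String, Integer>> perTask = new ArrayList<>();
        for (int i = 0; i < countTasks; i++) {
            perTask.add(new HashMap<>());
        }
        for (String sentence : sentences) {
            // "split" step: break each sentence into words
            for (String word : sentence.split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Stand-in for fields grouping on "word":
                // the same word always maps to the same task index.
                int task = Math.floorMod(word.hashCode(), countTasks);
                // "count" step: that task updates its local counter
                perTask.get(task).merge(word, 1, Integer::sum);
            }
        }
        return perTask;
    }
}
```

Because each word is routed by its value, the running total for a word lives in exactly one task's map. With shuffle-style routing instead, partial counts for the same word would be scattered across tasks and would need a further merge step.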