Developing Apache Storm Applications
Also available as:
PDF

Functions

Trident functions are similar to Storm bolts, in that they consume individual tuples and optionally emit new tuples. An important difference is that tuples emitted by Trident functions are additive.

Fields emitted by Trident functions are added to the tuple and existing fields are retained. The Split function in the word count example illustrates a function that emits additional tuples:

  public class Split extends BaseFunction {

    public void execute(TridentTuple tuple, TridentCollector collector) {
      String sentence = tuple.getString(0);
      for (String word : sentence.split(" ")) {
        collector.emit(new Values(word));
      }
    }
  }

Note that the Split function always processes the first (index 0) field in the tuple. It guarantees this because of the way that the function was applied using the Stream.each() method:

stream.each(new Fields("sentence"), new Split(), new Fields("word"))

The first argument to the each() method can be thought of as a field selector. Specifying “sentence” tells Trident to select only that field for processing, thus guaranteeing that the “sentence” field will be at index 0 in the tuple.

Similarly, the third argument names the fields emitted by the function. This behavior allows both filters and functions to be implemented in a more generic way, without depending on specific field naming conventions.