Tuning Topologies

Tune Additional Batch Indexing Storm Settings

After you have determined the number of executors and thoroughly tested them, you can set or modify the last remaining Storm parameters.

  1. Based on the capacity you've seen during testing, reduce the overall number of ackers.
    Alternatively, you can leave a single acker per worker, which ensures that no messages are sent between Storm workers over the network interface.
  2. Set the Max Spout Pending parameter such that the maximum number of unacked tuples in the topology is close to the Parser Executor capacity (for example, ~0.950).
    This ensures that the topology does not become overloaded if there is a large spike in incoming events. To find a suitable value, you can increase the producer events per second by a large amount and test various values for Max Spout Pending. You can set the value under the Storm settings of the relevant parser.
    vi ~/enrichment.properties
    ##### Storm #####
  3. Check the Executor capacity.
    The executor capacity should not exceed ~0.950. Assuming the number of events generated by the producer is far greater than the capacity of the parser topology, capacity is the only value that you need to monitor in the Storm UI.
  4. If you need to increase the Error Writer Num Executors value, you can directly modify the Flux file and include the parallelism parameter under the appropriate Storm Bolt declarations.
    sudo vi /usr/hcp/current/metron/flux/indexing/batch/remote.yaml
    id: "indexingErrorOutputBolt"
    className: "org.apache.metron.writer.bolt.BulkMessageWriterBolt"
         - "${kafka.zk}"
         - name: "withMessageWriter"
                        - ref: "KafkaWriter"
    parallelism: 3
    Generally, since a small number of errors is expected, you do not need to increase the Error Writer Num Executors value.
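
The capacity check in step 3 can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that the capacity shown in the Storm UI is the number of executed tuples multiplied by the average execute latency, divided by the measurement window. The function names, the 10-minute window, and the sample numbers are hypothetical and are not taken from a real topology.

```python
# Threshold from step 3: executor capacity should not exceed ~0.950.
CAPACITY_LIMIT = 0.95


def executor_capacity(executed: int, avg_execute_latency_ms: float, window_ms: float) -> float:
    """Return the fraction of the window an executor spent executing tuples.

    Assumed formula: (executed * average execute latency) / window length.
    """
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return executed * avg_execute_latency_ms / window_ms


def is_overloaded(capacity: float, limit: float = CAPACITY_LIMIT) -> bool:
    """Return True when the capacity is above the recommended limit."""
    return capacity > limit


if __name__ == "__main__":
    # Hypothetical sample: 1,080,000 tuples at 0.5 ms each in a 10-minute window.
    window_ms = 10 * 60 * 1000
    capacity = executor_capacity(1_080_000, 0.5, window_ms)
    print(f"capacity={capacity:.3f} overloaded={is_overloaded(capacity)}")
```

If a check like this reports an overloaded executor during a load test, lowering Max Spout Pending or raising the executor count are the two adjustments described in the steps above.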