Tuning Guide

Parser Tuning

We'll be using the Bro sensor in this example.


The parser and PCAP topologies are launched with a builder utility, whereas the enrichment and indexing topologies are defined with Flux.

We started with a single partition for the inbound Kafka topics and eventually worked our way up to 48 partitions. We set the max spout pending value shown below. The default is null, which imposes no limit.


      "topology.max.spout.pending" : 2000
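
As a minimal sketch, the setting above could be saved as the JSON map that the -e option of the start script reads. The CONF_DIR variable and its default location are assumptions made for illustration; the file name matches the one passed to -e in the start command in this guide.

```shell
# Assumption: CONF_DIR is a hypothetical location for the config files;
# point it at the directory you actually launch the topology from.
CONF_DIR="${CONF_DIR:-$HOME/.storm}"
mkdir -p "$CONF_DIR"

# The file passed with -e must contain a JSON map, so the single
# setting is wrapped in braces.
cat > "$CONF_DIR/storm-bro.config" <<'EOF'
{
  "topology.max.spout.pending" : 2000
}
EOF

# Show what was written.
cat "$CONF_DIR/storm-bro.config"
```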

We also used the following spout settings. Because these are the defaults, they can be omitted entirely.


      "spout.pollTimeoutMs" : 200,
      "spout.maxUncommittedOffsets" : 10000000,
      "spout.offsetCommitPeriodMs" : 30000
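
If you do want to keep the spout settings in a file, a sketch like the following writes them as a JSON map for the -esc option. The CONF_DIR variable is again an assumption, and the optional validity check relies on python3 being installed, which this guide does not require.

```shell
# Assumption: CONF_DIR is a hypothetical location for the config files.
CONF_DIR="${CONF_DIR:-$HOME/.storm}"
mkdir -p "$CONF_DIR"

# Spout settings from this guide, wrapped in braces to form a JSON map.
cat > "$CONF_DIR/spout-bro.config" <<'EOF'
{
  "spout.pollTimeoutMs" : 200,
  "spout.maxUncommittedOffsets" : 10000000,
  "spout.offsetCommitPeriodMs" : 30000
}
EOF

# Optional: confirm the file parses as JSON (a stray trailing comma is
# an easy mistake). Skipped when python3 is not available.
if command -v python3 >/dev/null 2>&1; then
  python3 -m json.tool "$CONF_DIR/spout-bro.config" >/dev/null \
    && echo "spout-bro.config is valid JSON"
fi
```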

We ran our Bro parser topology with the following options. We did not need to match our parallelism to the number of Kafka partitions in this case, though you can do so if necessary. Note that we needed only 1 worker.

 /usr/metron/0.4.0/bin/start_parser_topology.sh -k $BROKERLIST -z $ZOOKEEPER -s bro -ksp SASL_PLAINTEXT \
     -ot enrichments \
     -e ~metron/.storm/storm-bro.config \
     -esc ~/.storm/spout-bro.config \
     -sp 24 \
     -snt 24 \
     -nw 1 \
     -pnt 24 \
     -pp 24
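
The relationship between the 48 Kafka partitions and the 24 spout tasks can be checked with a little arithmetic before launching. This sketch assumes the spout spreads partitions evenly across its tasks, which this guide does not state explicitly; the variable names are illustrative only.

```shell
PARTITIONS=48     # partitions on the inbound Kafka topic (from this guide)
SPOUT_TASKS=24    # value passed to -snt in the command above

# Partitions each spout task would read if they are spread evenly.
PER_TASK=$(( PARTITIONS / SPOUT_TASKS ))
echo "partitions per spout task: $PER_TASK"

# An uneven split would leave some tasks with more partitions than others.
if [ $(( PARTITIONS % SPOUT_TASKS )) -ne 0 ]; then
  echo "warning: partitions do not divide evenly across spout tasks" >&2
fi
```

With the values used here, each spout task would read 2 partitions.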

From the usage docs, here are the options we've used. The full reference can be found in the Parsers README.

  -e,--extra_topology_options (JSON_FILE)       Extra options in the form
                                                of a JSON file with a map
                                                for content.
  -esc,--extra_kafka_spout_config (JSON_FILE)   Extra spout config options
                                                in the form of a JSON file
                                                with a map for content.
                                                Possible keys are:
                                                retryDelayMaxMs,
                                                retryDelayMultiplier,
                                                retryInitialDelayMs,
                                                stateUpdateIntervalMs,
                                                bufferSizeBytes,
                                                fetchMaxWait,
                                                fetchSizeBytes,
                                                maxOffsetBehind,
                                                metricsTimeBucketSizeInSecs,
                                                socketTimeoutMs
  -sp,--spout_p (SPOUT_PARALLELISM_HINT)        Spout Parallelism Hint
  -snt,--spout_num_tasks (NUM_TASKS)            Spout Num Tasks
  -nw,--num_workers (NUM_WORKERS)               Number of Workers
  -pnt,--parser_num_tasks (NUM_TASKS)           Parser Num Tasks
  -pp,--parser_p (PARALLELISM_HINT)             Parser Parallelism Hint