Tuning Guide

Indexing (HDFS) Tuning Example

The indexing Kafka topic is set to 48 partitions, carried over from the enrichment exercise above. The output of the enrichment topology is the input to the indexing topology, so the two partition counts match.

These are the batch size settings for the Bro index.

cat $METRON_HOME/config/zookeeper/indexing/bro.json
{
  "hdfs" : {
    "index": "bro",
    "batchSize": 50,
    "enabled" : true
  }...
}
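
As a quick check before tuning, the batch sizes in an indexing config can be listed programmatically. The following is a minimal sketch, not part of Metron: the function name `writer_batch_sizes` is hypothetical, and the sample text only mirrors the fragment shown above (a real bro.json may contain additional writer sections that are elided here).

```python
import json

# Sample mirroring the fragment shown above. This is an assumption for
# illustration; a real bro.json may define other writers as well.
SAMPLE_CONFIG = """
{
  "hdfs": {
    "index": "bro",
    "batchSize": 50,
    "enabled": true
  }
}
"""


def writer_batch_sizes(config_text):
    """Return {writer_name: batchSize} for each writer section that sets batchSize."""
    config = json.loads(config_text)
    return {
        name: settings["batchSize"]
        for name, settings in config.items()
        if isinstance(settings, dict) and "batchSize" in settings
    }


if __name__ == "__main__":
    for writer, size in writer_batch_sizes(SAMPLE_CONFIG).items():
        print(f"{writer}: batchSize={size}")
```

To inspect a real file, read its contents and pass the text to the function in place of the sample.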
  

And here are the settings we used for the indexing topology:

General Storm settings

topology.workers: 4
topology.acker.executors: 24
topology.max.spout.pending: 2000

Spout and Bolt Settings

hdfsSyncPolicy
    org.apache.storm.hdfs.bolt.sync.CountSyncPolicy
    constructor arg=100000
hdfsRotationPolicy
    bolt.hdfs.rotation.policy.units=DAYS
    bolt.hdfs.rotation.policy.count=1
kafkaSpout
    parallelism: 24
    session.timeout.ms=29999
    enable.auto.commit=false
    setPollTimeoutMs=200
    setMaxUncommittedOffsets=10000000
    setOffsetCommitPeriodMs=30000
hdfsIndexingBolt
    parallelism: 24
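
To see how these settings relate to one another, the sketch below derives a few quantities from the values listed above. It is illustrative only: the class and method names are hypothetical, it assumes one task per executor (the Storm default), and it relies on topology.max.spout.pending being applied per spout task. The executors-per-worker figure is an average that ignores Storm's internal system executors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexingTuning:
    """Values copied from the settings listed above."""
    kafka_partitions: int = 48
    workers: int = 4
    acker_executors: int = 24
    max_spout_pending: int = 2000
    spout_parallelism: int = 24
    bolt_parallelism: int = 24

    def partitions_per_spout(self):
        # Kafka partitions each spout executor has to consume.
        return self.kafka_partitions / self.spout_parallelism

    def max_in_flight_tuples(self):
        # max.spout.pending is enforced per spout task, so the topology-wide
        # ceiling scales with spout parallelism (assuming one task per executor).
        return self.max_spout_pending * self.spout_parallelism

    def executors_per_worker(self):
        # Average share of spout, bolt, and acker executors per worker process.
        total = self.spout_parallelism + self.bolt_parallelism + self.acker_executors
        return total / self.workers


if __name__ == "__main__":
    tuning = IndexingTuning()
    print("partitions per spout executor:", tuning.partitions_per_spout())
    print("max un-acked tuples in flight:", tuning.max_in_flight_tuples())
    print("average executors per worker:", tuning.executors_per_worker())
```

With the values above, each spout executor reads 2 partitions and the topology allows at most 48,000 un-acked tuples in flight. Changing spout parallelism or the partition count shifts both numbers, so it helps to recompute them whenever either setting is adjusted.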