Indexing (HDFS) Tuning Example
There are 48 partitions set for the indexing partition, which carries through from the enrichment exercise above. The enrichment output matches the input for the indexing partition.
These are the batch size settings for the Bro index.
cat $METRON_HOME/config/zookeeper/indexing/bro.json { "hdfs" : { "index": "bro", "batchSize": 50, "enabled" : true }... }
And here are the settings we used for the indexing topology:
General storm settings
topology.workers: 4 topology.acker.executors: 24 topology.max.spout.pending: 2000
Spout and Bolt Settings
hdfsSyncPolicy org.apache.storm.hdfs.bolt.sync.CountSyncPolicy constructor arg=100000 hdfsRotationPolicy bolt.hdfs.rotation.policy.units=DAYS bolt.hdfs.rotation.policy.count=1 kafkaSpout parallelism: 24 session.timeout.ms=29999 enable.auto.commit=false setPollTimeoutMs=200 setMaxUncommittedOffsets=10000000 setOffsetCommitPeriodMs=30000 hdfsIndexingBolt parallelism: 24