Hortonworks Cybersecurity Platform
Also available as:
PDF

Parser Tuning

You can modify certain parser properties to tune your HCP architecture using the Management module. Modifying properties using the Management module is simple and can be performed by any user.

Parsers tend to vary a lot. Some will be very high volume receiving thousands of messages per second and others will be much lower. Rather than using a standard setting for the number of partitions and parallelism, you should base your settings on the expected data volume. That said, use the following guidelines:

  • The spout parallelism should be roughly the same as your Kafka partitions.

  • Consider data flow when assigning Kafka partitions to parsers.

  • Keep in mind the aggregate number of partitions when assigning them to partitions. You do not want to assign the maximum number of partitions to each parser because that can overload your system.

The parser topologies are deployed by a builder pattern that takes parameters from the CLI as set by the Management module. The parser properties materialize as follows:

Management UI -> parser json config and CLI -> Storm

The following table lists the parser properties you can modify in the Management module:

Category Management UI Property Name CLI Option
Storm topology config Num Workers -nw,--num_workers <NUM_WORKERS>
Num Ackers --na,--num_ackers <NUM_ACKERS>
Storm Config <JSON_FILE>, e.g., { "topology.max.spout.pending" : NUM }
Kafka Spout Parallelism -sp,--spout_p <SPOUT_PARALLELISM_HINT>
Spout Num Tasks -snt,--spout_num_tasks <NUM_TASKS>
Spout Config <JSON_FILE>, e.g., { "spout.pollTimeoutMs" : 200 }
Spout Config <JSON_FILE>, e.g., { "spout.maxUncommittedOffsets" : 10000000 }
Spout Config <JSON_FILE>, e.g., { "spout.offsetCommitPeriodMs" : 30000 }
Parser bolt Parser Num Tasks -pnt,--parser_num_tasks <NUM_TASKS>
Parser Parallelism -pp,--parser_p <PARALLELISM_HINT>
Parser Parallelism -pp,--parser_p <PARALLELISM_HINT>

All of the Storm parameters are available in the STORM SETTINGS section of the Management module.

For the Storm config and Spout config properties, you enter the JSON_FILE information in the appropriate field using the JSON format supplied in the following table.

For more detail on starting parsers, see Starting and Stopping Parsers.