Understanding Parsing

Understanding Parsers

Parsers are pluggable components that transform raw data (textual or raw bytes) into JSON messages suitable for downstream enrichment and indexing.

Data flows through the parser bolt via Apache Kafka and into the enrichments topology in Apache Storm. When an error occurs, it is collected together with its context (for example, the stack trace) and the original message that caused it, and is sent to an error queue. Messages that the global validation functions determine to be invalid are also treated as errors and sent to an error queue.
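The error handling described above can be sketched as follows. This is only an illustration of the idea, not HCP's implementation: the `route` function, the toy `parse_kv` parser, the `has_timestamp` validator, and all JSON field names (`error_type`, `stack_trace`, `raw_message`, `failed_validators`) are hypothetical names chosen for this example, and the Kafka and Storm plumbing is omitted.

```python
import json
import traceback


def route(raw, parse, validators):
    """Return ("parsed", json_text) or ("error", json_text) for one raw message.

    All field names in the error records are hypothetical.
    """
    original = raw.decode("utf-8", errors="replace")
    try:
        message = parse(raw)
    except Exception as exc:
        # Parsing failed: keep the error context and the original message.
        error = {
            "error_type": "parser_error",
            "exception": repr(exc),
            "stack_trace": traceback.format_exc(),
            "raw_message": original,
        }
        return "error", json.dumps(error)

    # Parsing succeeded: apply every global validation function.
    failed = [check.__name__ for check in validators if not check(message)]
    if failed:
        error = {
            "error_type": "parser_invalid",
            "failed_validators": failed,
            "raw_message": original,
        }
        return "error", json.dumps(error)

    return "parsed", json.dumps(message)


def parse_kv(raw):
    """Hypothetical toy parser: b"key=value key=value" -> dict."""
    return dict(pair.split("=", 1) for pair in raw.decode("utf-8").split())


def has_timestamp(message):
    """Hypothetical global validation: every message needs a timestamp."""
    return "timestamp" in message
```

Under these assumptions, a well-formed message such as `b"timestamp=1 ip=10.0.0.1"` is routed to the parsed output, an unparseable one such as `b"garbage"` produces an error record carrying the stack trace, and a message without a `timestamp` field is routed to the error output because it fails validation.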

For example, with a Squid parser, NiFi ingests the contents of the Squid proxy access log. The parser then transforms the log contents, converts them to JSON, and inserts the result into a Squid Kafka topic, which is then passed on to Metron.
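To make the transformation step concrete, here is a minimal sketch of turning one Squid access log line into a JSON message. It assumes Squid's default native log format of ten whitespace-separated fields. The function name `parse_squid_line` and every output field name are hypothetical, chosen for illustration rather than taken from HCP's Squid parser, and writing the result to Kafka is left out.

```python
import json


def parse_squid_line(line):
    """Convert one Squid native-format access log line to a JSON string.

    Assumes ten whitespace-separated fields; output field names are hypothetical.
    """
    parts = line.split()
    if len(parts) != 10:
        raise ValueError("expected 10 fields, got %d" % len(parts))

    (timestamp, elapsed, client, action_code, size,
     method, url, ident, hierarchy, content_type) = parts

    # Squid logs "seconds.milliseconds"; build epoch milliseconds with
    # integer arithmetic to avoid floating-point rounding.
    seconds, _, millis = timestamp.partition(".")
    epoch_ms = int(seconds) * 1000 + int(millis.ljust(3, "0")[:3])

    action, _, status = action_code.partition("/")
    peer_status, _, peer_host = hierarchy.partition("/")

    message = {
        "timestamp": epoch_ms,
        "elapsed_ms": int(elapsed),
        "client_ip": client,
        "action": action,
        "http_status": int(status),
        "bytes": int(size),
        "method": method,
        "url": url,
        "ident": ident,
        "peer_status": peer_status,
        "peer_host": peer_host,
        "content_type": content_type,
        "original_string": line,
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Illustrative log line, not taken from a real proxy.
    sample = ("1467011157.401 415 127.0.0.1 TCP_MISS/200 337891 GET "
              "http://www.example.com/index.html - DIRECT/192.0.2.10 text/html")
    print(parse_squid_line(sample))
```

Keeping the original line in the message (here under the hypothetical key `original_string`) is a design choice that lets downstream consumers recover the raw record if a field was parsed incorrectly.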

HCP supports two types of parsers: general purpose and Java.