
Understanding Throughput

Data flows through HCP in real time: Apache Kafka ingests the raw telemetry data; HCP parses it into a structure that HCP can read, enriches it with asset, geo, and threat intelligence information, and then indexes and stores the enriched data.

Depending on the type of data streaming into HCP, ingestion occurs through Apache NiFi, performant network ingestion probes, or real-time and batch threat intelligence feed loaders.
  • Apache Kafka ingests information from telemetry data sources through the telemetry event buffer (a minimal producer sketch in Java follows this list).

    This information is the raw telemetry data consisting of host logs, firewall logs, emails, and network data. Depending on the type of data you are streaming into HCP, you can use one of the following telemetry data collectors to ingest the data:

    NiFi

    This type of streaming works for most types of telemetry data sources. See the NiFi documentation for more information.

    Performant network ingestion probes

    This type of streaming works for high-volume packet data.

    Real-time and batch threat intelligence feed loaders

    This type of streaming works for ingesting real-time and batch threat intelligence feeds.

  • After the data is ingested into Kafka topics, it is parsed into a normalized JSON structure that HCP can read. The data is parsed using a Java or general-purpose parser, and the parser configuration is uploaded to Apache ZooKeeper. A Kafka topic is created for every telemetry data source (see the parser sketch after this list).

  • The parsed data is enriched with asset, geo, and threat intelligence information (see the enrichment sketch after this list).

  • The information is indexed and stored, and any resulting alerts are sent to the Metron dashboard.
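
The following is a minimal sketch of the first step, publishing one raw telemetry event to Kafka from Java. The broker address, the topic name "hostlogs", and the sample log line are assumptions for illustration; in a real deployment the topic is the one configured for your telemetry sensor, and a collector such as NiFi typically does this publishing for you.

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TelemetryIngest {
        public static void main(String[] args) {
            // Broker address is an assumption; substitute your own Kafka brokers.
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka-broker1:6667");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // One raw, unparsed log line, exactly as the telemetry source emits it.
                String rawLog = "Jan 10 10:15:01 host1 sshd[1234]: "
                        + "Failed password for root from 10.0.0.5 port 22 ssh2";
                // Publish to the sensor's input topic (hypothetical name "hostlogs").
                producer.send(new ProducerRecord<>("hostlogs", rawLog));
            }
        }
    }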
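
The second step turns that raw line into normalized JSON. The sketch below illustrates the idea with a hand-written regular-expression parser; the field names original_string, timestamp, source.type, and ip_src_addr follow Metron's conventions, while the regex, class name, and sensor name are illustrative stand-ins for the Java and general-purpose parsers that HCP ships.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class HostLogParser {
        // Illustrative pattern matching the sample sshd line above; a real
        // deployment would use a bundled parser or a Grok expression instead.
        private static final Pattern FAILED_LOGIN = Pattern.compile(
                "Failed password for (\\S+) from (\\S+) port (\\d+)");

        public static Map<String, Object> parse(String rawLog) {
            Map<String, Object> json = new HashMap<>();
            // Every parsed record keeps the raw message and carries a timestamp
            // and a source type so downstream topologies can route it.
            json.put("original_string", rawLog);
            json.put("timestamp", System.currentTimeMillis());
            json.put("source.type", "hostlogs"); // hypothetical sensor name

            Matcher m = FAILED_LOGIN.matcher(rawLog);
            if (m.find()) {
                json.put("username", m.group(1));
                json.put("ip_src_addr", m.group(2));
                json.put("ip_src_port", Integer.parseInt(m.group(3)));
            }
            return json;
        }
    }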
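
The third step adds enrichment fields to the parsed message without removing any existing fields. The sketch below fakes a geo lookup with an in-memory map; in HCP the lookups are backed by stores such as a GeoIP database and HBase, and the enrichment field name used here is illustrative.

    import java.util.HashMap;
    import java.util.Map;

    public class GeoEnricher {
        // Hypothetical in-memory table standing in for HCP's geo/asset stores.
        private final Map<String, String> countryByIp = new HashMap<>();

        public GeoEnricher() {
            countryByIp.put("10.0.0.5", "US"); // sample entry for illustration
        }

        // Enrichment only adds fields; the original parsed fields are preserved.
        public Map<String, Object> enrich(Map<String, Object> parsed) {
            Object srcIp = parsed.get("ip_src_addr");
            if (srcIp != null) {
                String country = countryByIp.get(srcIp.toString());
                if (country != null) {
                    parsed.put("enrichments.geo.ip_src_addr.country", country);
                }
            }
            return parsed;
        }
    }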