Run Book

Bulk Loading Enrichment Information

Bulk loading is used to load information that does not change frequently. For example, bulk loading is ideal for loading from an asset database on a daily or even weekly basis because the number of assets on a network rarely changes.

Enrichment data can be bulk loaded from the local file system or HDFS. The enrichment loader transforms the enrichment data into a JSON format that Metron understands. The loading framework can also age data out of the enrichment stores based on time. Once the stores are loaded, an enrichment bolt that can interact with the enrichment store can be incorporated into the enrichment topology.
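To make the transformation step concrete, the following is a minimal Python sketch of how a CSV row might be turned into a keyed JSON record. It is an illustration only, not the loader's actual code: the sample data, the column layout, the `asset` enrichment type, and the function name `rows_to_enrichments` are all hypothetical.

```python
import csv
import io
import json

# Hypothetical sample data: one asset per row (hostname, owner, location).
SAMPLE_CSV = """web01.example.com,alice,datacenter-1
db01.example.com,bob,datacenter-2
"""

# Hypothetical column layout: the position of each field in a CSV row.
COLUMNS = {"hostname": 0, "owner": 1, "location": 2}


def rows_to_enrichments(text, columns, indicator_column, enrichment_type):
    """Turn CSV text into (key, JSON value) pairs, one per non-empty row."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        # Name each positional field according to the column layout.
        value = {name: row[index] for name, index in columns.items()}
        # The key pairs the enrichment type with the row's indicator.
        key = {"type": enrichment_type, "indicator": value[indicator_column]}
        records.append((key, json.dumps(value, sort_keys=True)))
    return records


records = rows_to_enrichments(SAMPLE_CSV, COLUMNS, "hostname", "asset")
for key, value in records:
    print(key["indicator"], "->", value)
```

Each record is keyed by the enrichment type and the indicator value, which is the general shape a key-value store such as HBase needs for lookups during enrichment.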

You can bulk load enrichment information from the following sources:

  • CSV Flat File Ingestion

  • HDFS via MapReduce

  • Taxii Loader

For our example, we will use CSV flat file ingestion. For more information about Taxii Loader and HDFS via MapReduce, see Bulk Loading Threat Intelligence Information.

Bulk loading information consists of the following major steps:

  • OPTIONAL: Create a Mock Enrichment Source

    For our runbook demonstration we create a mock enrichment source. In your production environment you will want to use a genuine enrichment source.

  • Configuring an Extractor Configuration File

    The extractor configuration file is used to bulk load the enrichment store into HBase.

  • Configuring Element-to-Enrichment Mapping

    This section configures what element of a tuple should be enriched with what enrichment type.

  • Running the Enrichment Loader

    After the enrichment source and enrichment configuration are defined, you must run the loader to move the data from the enrichment source to the HCP enrichment store (HBase) and store the enrichment configuration in ZooKeeper.
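As a rough preview of the two configuration files named in these steps, the Python sketch below builds an extractor configuration and an element-to-enrichment mapping as dictionaries and prints them as JSON. The key names (`columns`, `indicator_column`, `separator`, `zkQuorum`, `sensorToFieldList`, `fieldToEnrichmentTypes`) are assumptions modeled on the Metron CSV extractor layout, and the column names, the `asset` type, the `squid` sensor, and the ZooKeeper address are hypothetical. Verify all of them against the documentation for your HCP version.

```python
import json

# Assumed extractor configuration for a CSV source (hypothetical columns).
extractor_config = {
    "config": {
        "columns": {"hostname": 0, "owner": 1, "location": 2},
        "indicator_column": "hostname",  # column used as the lookup key
        "type": "asset",                 # enrichment type written to the store
        "separator": ",",
    },
    "extractor": "CSV",
}

# Assumed element-to-enrichment mapping: which field of which sensor's
# messages is enriched with which enrichment type (hypothetical values).
enrichment_config = {
    "zkQuorum": "localhost:2181",
    "sensorToFieldList": {
        "squid": {
            "type": "ENRICHMENT",
            "fieldToEnrichmentTypes": {"hostname": ["asset"]},
        }
    },
}


def mapped_types(enrichment):
    """Collect every enrichment type referenced by the mapping."""
    types = set()
    for sensor in enrichment["sensorToFieldList"].values():
        for type_list in sensor["fieldToEnrichmentTypes"].values():
            types.update(type_list)
    return types


# Sanity check: the type loaded by the extractor should be one that the
# mapping refers to; otherwise the loaded data would never be looked up.
types_match = extractor_config["config"]["type"] in mapped_types(enrichment_config)

print(json.dumps(extractor_config, indent=2))
print(json.dumps(enrichment_config, indent=2))
print("types match:", types_match)
```

The consistency check is not part of the loader. It only illustrates that the enrichment type named in the extractor configuration and the type named in the mapping must agree for the enrichment to take effect.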