Administration

Parsing a New Data Source to HCP

Parsers transform raw data (textual or raw bytes) into JSON messages suitable for downstream enrichment and indexing by HCP. Each data source has its own parser, and the parsed output is piped to the Enrichment/Threat Intelligence topology.

You can transform the fields in the JSON messages into formats that make the output more useful. For example, you can change the timestamp field output from GMT to your timezone.

You must make the following decisions before you parse a new data source:

  • Type of parser you will use for your data source

    For more information about which parser to use, see Parsers.

    HCP supports two types of parsers, general purpose and Java:

    • General Purpose - HCP supports two general purpose parsers: Grok and CSV. These parsers are ideal for structured or semi-structured logs that are well understood and for telemetries with lower volumes of traffic.

    • Java - A Java parser is appropriate for a telemetry type that is complex to parse and has high volumes of traffic.

  • How you will parse the new data source

    HCP enables you to parse a new data source and transform data fields using the HCP Management module or the command line interface. Both methods are described in the following sections.

  • What data you intend to search, sort, and aggregate when using the Alerts UI

    By default, string values are mapped as "type": "text", which does not support the sorting and aggregation that the Alerts UI relies on. To enable sorting and aggregate operations, you have two options:

    • Explicitly add a mapping for that property to an Elasticsearch template. You can also refer to the section called “Updating Elasticsearch Templates to Work with Elasticsearch 5.x” for information about using Elasticsearch 5.x.

    • Add a global mapping to Elasticsearch that will automatically map that property to a type that is searchable/sortable/aggregatable for all indexes.

      For example, you can set a template to match all indexes that maps strings to text with fielddata enabled:

      # curl -XPUT 'http://${ES_HOST}:${ES_PORT}/_template/default_string_template' -d '
      {
          "template": "*",
          "mappings" : {
              "${your_type_here}": {
                  "dynamic_templates": [
                      {
                          "strings": {
                              "match_mapping_type": "string",
                              "mapping": {
                                  "type": "text",
                                  "fielddata": true
                              }
                          }
                      }
                  ]
              }
          }
      }'

      For information on the difference between type=text and type=keyword, see the section called “Type Mapping Changes”.
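
The first option, adding an explicit mapping for a single property, could be sketched as follows. This is a minimal illustration under stated assumptions, not a template shipped with HCP: the helper name keyword_template, the index pattern squid_index*, the document type squid_doc, the field source_hostname, and the template name squid_keyword_template are all hypothetical placeholders. The template body uses the Elasticsearch 5.x layout ("template" key, mappings keyed by document type), and the request is sent only when ES_HOST is set.

```shell
#!/bin/sh
# Hypothetical helper: prints an Elasticsearch 5.x index template that maps
# one string property to "keyword" so it can be sorted and aggregated.
# Arguments: $1 index pattern, $2 document type, $3 field name.
keyword_template() {
  cat <<EOF
{
  "template": "$1",
  "mappings": {
    "$2": {
      "properties": {
        "$3": { "type": "keyword" }
      }
    }
  }
}
EOF
}

# Send the template only when a host is configured; ES_PORT falls back to 9200.
# All names passed below are placeholders for your own index, type, and field.
if [ -n "${ES_HOST:-}" ]; then
  keyword_template 'squid_index*' 'squid_doc' 'source_hostname' |
    curl -XPUT "http://${ES_HOST}:${ES_PORT:-9200}/_template/squid_keyword_template" \
      -H 'Content-Type: application/json' -d @-
fi
```

Unlike the global template, this approach touches only the named property, so other string fields keep the default mapping. A template applies only to indexes created after it is stored, so existing indexes keep their current mappings.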