Creating Profiles

Run the Batch Profiler in Advanced Mode

As an alternative to using the start_batch_profiler.sh script, you can run the Batch Profiler in advanced mode. Running the Batch Profiler in advanced mode allows you to specify arguments that customize how the profiler runs.

Start the Batch Profiler by entering the following:
${SPARK_HOME}/bin/spark-submit \
    --class org.apache.metron.profiler.spark.cli.BatchProfilerCLI \
    --properties-file ${SPARK_PROPS_FILE} \
    ${METRON_HOME}/lib/metron-profiler-spark-*.jar \
    --config ${PROFILER_PROPS_FILE} \
    --profiles ${PROFILES_FILE}
The Batch Profiler accepts the following arguments when run from the command line. All arguments following the profiler jar are passed to the profiler. All arguments preceding the profiler jar are passed to Spark.
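For reference, the Spark properties file referenced by ${SPARK_PROPS_FILE} contains standard Spark settings. The following is a minimal sketch; the values shown are illustrative assumptions and should be tuned for your cluster.

# Example ${SPARK_PROPS_FILE}; values are illustrative and cluster-specific.
spark.master                    yarn
spark.submit.deployMode         client
spark.executor.instances        4
spark.executor.memory           8g
spark.sql.shuffle.partitions    32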
-p, --profiles
The path to a file containing the profile definitions in JSON. Only one of --zookeeper or --profiles should be used. A minimal example of a profile definitions file appears after this list.
-z, --zookeeper
The ZooKeeper quorum from which to read profile definitions. Only one of --zookeeper or --profiles should be used.
-t, --timestampfield
Specifies which data field to use for event time. The field to use for event time is usually stored as part of the profile. It can be overridden via this setting.
-c, --config
The path to a file containing key-value properties for the profiler. This file contains the properties described in Batch Profiler Properties.
-g, --globals
The path to a file containing key-value properties that define the global properties. You can use this property to customize how certain Stellar functions behave during execution.
-r, --reader
The path to a file containing key-value properties that are passed to the DataFrameReader when reading the input telemetry. This allows additional customization of how the input telemetry is read. An illustrative example appears after this list.
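When using --profiles, the file contains the same JSON profile definitions that would otherwise be stored in ZooKeeper. The following is a minimal sketch; the "hello-world" profile and the timestamp field name are illustrative only.

{
  "profiles": [
    {
      "profile": "hello-world",
      "foreach": "ip_src_addr",
      "init":    { "count": "0" },
      "update":  { "count": "count + 1" },
      "result":  "count"
    }
  ],
  "timestampField": "timestamp"
}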
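When using --reader, the properties are passed through as options on the Spark DataFrameReader. The following sketch assumes the input telemetry is stored as multi-line JSON; the option names are standard Spark reader options and are shown for illustration only.

# Illustrative DataFrameReader options for multi-line JSON telemetry.
multiLine=true
mode=DROPMALFORMED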