Hortonworks Cybersecurity Platform
Also available as:
PDF

Statistical Outlier Detection

Data scientists frequently want to find anomalies in numerical data. To that end, HCP has some simple statistical outlier detectors.

Table 1. Statistical Outlier Detection
Function Description Input Returns
OUTLIER_MAD_STATE_MERGE Update the statistical state required to compute the Median Absolute Deviation.
  • state - A list of Median Absolute Deviation States to merge. Generally these are states across time.

  • currentState? - The current state (optional)

The Median Absolute Deviation state
OUTLIER_MAD_ADD Add a piece of data to the state
  • state -The MAD state

  • value - The numeric value to add

The MAD state
OUTLIER_MAD_SCORE Get the modified z-score normalized by the MAD: scale * | x_i - median(X) | / MAD.
  • state - The MAD state

  • value - The numeric value to score

  • scale? -Optionally the scale to use when computing the modified z-score. Default is 0.6745.

The modified z-score