Hortonworks Cybersecurity Platform
Also available as:
PDF

Median Absolution Deviation

Much has been written about this robust estimator. See the first page of Alternatives to the Median Absolute Deviation for coverage of the good and the bad of median absolution deviation (MAD). The usage, however is fairly straightforward.

  • Gather the statistical state required to compute the MAD.

      • The distribution of the values of a univariate random variable over time.

      • The distribution of the absolute deviations of the values from the median.

    • Use this statistical state to score unseen values. The higher the score, the more unlike the previously seen data the value is.

There are a couple of issues which make MAD hard to compute. First, the statistical state requires computing median, which can be computationally expensive to compute exactly. To get around this, we use the OnlineStatisticalProvider to compute a sketch rather than the exact median. Secondly, the statistical state for seasonal data should be limited to a fixed, trailing window. We do this by ensuring that the MAD state is mergeable and able to be queried from within the Profiler.