Hortonworks Cybersecurity Platform
Also available as:
PDF

Approximation Statistics

Data scientists frequently want to find anomalies in numerical data. To that end, HCP has some simple approximation statistics.

Table 1. Approximation Statistics
Function Description Input Returns
HLLP_ADD Add value to the HyperLogLogPlus estimator set.
  • hyperLogLogPlus - The hllp estimator to add a value to

  • value+ - Value to add to the set. Takes a single item or a list.

The HyperLogLogPlus set with a new value added
HLLP_CARDINALITY Returns HyperLogLogPlus-estimated cardinality for this set.
  • hyperLogLogPlus - The hllp set

Long value representing the cardinality for this set
HLLP_INIT Initializes the HyperLogLogPlus estimator set. p must be a value between 4 and sp and sp must be less than 32 and greater than 4.
  • p - The precision value for the normal set

  • sp - The precision value for the sparse set. If p is set, but sp is 0 or not specified, the sparse set will be disabled.

A new HyperLogLogPlus set
HLLP_MERGE Merge hllp sets together. The resulting estimator is initialized with p and sp precision values from the first provided hllp estimator set.
  • hllp - List of hllp estimators to merge. Takes a single hllp set or a list.

True if the filter might contain the value and false otherwise