Administration

Appendix A. Stellar Language Functions

This section lists the Stellar language functions supported by Hortonworks Cybersecurity Package (HCP) powered by Apache Metron.

The Stellar query language supports the following:

  • Referencing fields in the enriched JSON

  • Simple boolean operations: and, not, or

  • Simple arithmetic operations: *, /, +, - on real numbers or integers

  • Simple comparison operations: <, >, <=, >=

  • if/then/else comparisons (for example, if var1 < 10 then 'less than 10' else '10 or more')

  • Determining whether a field exists (via exists)

  • Parentheses to make the order of operations explicit

  • User-defined functions

The following keywords must be escaped with single quotes to be used in Stellar expressions:

Table A.1. Stellar Language Keywords

not | else | exists | if | then
and | or | in | == | !=
<= | > | >= | + | -
< | ? | * | / | ,

Stellar Language Inclusion Checks ("in" and "not in")

  • "in" supports string contains. e.g., "'foo' in 'foobar' == true"

  • "in" supports collection contains. e.g., "'foo' in [ 'foo', 'bar' ] == true"

  • "in" supports map key contains. e.g., "'foo' in { 'foo' : 5} == true"

  • "not in" is the negation of the in expression. e.g., "'grok' not in 'foobar' == true"
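
The four inclusion rules above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not Stellar's implementation: the names stellar_in and stellar_not_in are hypothetical, and returning false for unsupported right-hand types is an assumption.

```python
def stellar_in(needle, haystack):
    """Model the documented "in" check (hypothetical helper, not a Stellar API)."""
    if isinstance(haystack, str):
        # String contains: 'foo' in 'foobar'
        return isinstance(needle, str) and needle in haystack
    if isinstance(haystack, dict):
        # Map key contains: only keys are tested, never values
        return needle in haystack
    if isinstance(haystack, (list, tuple, set)):
        # Collection contains: 'foo' in [ 'foo', 'bar' ]
        return needle in haystack
    # Assumption: any other right-hand type is treated as "not contained"
    return False


def stellar_not_in(needle, haystack):
    """ "not in" is simply the negation of "in"."""
    return not stellar_in(needle, haystack)
```

For instance, stellar_in('foo', {'foo': 5}) is true because 'foo' is a key, while stellar_in(5, {'foo': 5}) is false because 5 is only a value.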

Stellar Language Comparisons (`<`, `<=`, `>`, `>=`)

  • If either side of the comparison is null then return false.

  • If both values being compared are instances of Number, then:

    • If either side is a double then get double value from both sides and compare using given operator.

    • Else if either side is a float then get float value from both sides and compare using given operator.

    • Else if either side is a long then get long value from both sides and compare using given operator.

    • Otherwise get the int value from both sides and compare using given operator.

  • If both sides are of the same type and are comparable then use the compareTo method to compare values.

  • If none of the above conditions is met, an exception is thrown.
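
The decision order above can be sketched in Python as follows. Python has no separate int, long, float, and double types, so this model collapses the four numeric branches into two (floating point or integer); the function name stellar_compare is hypothetical and the code only illustrates the documented rules.

```python
import operator

_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def stellar_compare(left, op, right):
    """Model the documented comparison rules (hypothetical helper)."""
    # Rule 1: a null on either side always yields false
    if left is None or right is None:
        return False
    compare = _OPS[op]
    # Rule 2: two numbers are widened to the wider type before comparing
    if _is_number(left) and _is_number(right):
        if isinstance(left, float) or isinstance(right, float):
            return compare(float(left), float(right))
        return compare(int(left), int(right))
    # Rule 3: same type and comparable, so compare directly
    if type(left) is type(right):
        return compare(left, right)
    # Rule 4: nothing applied, so signal an error
    raise TypeError(f"cannot compare {type(left).__name__} with {type(right).__name__}")
```

With this model, stellar_compare(1, "<", 2.5) widens both sides to floating point, while stellar_compare(1, "<", "2") raises an error because the types differ and are not both numeric.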

Stellar Language Equality Check (`==`, `!=`)

Below is how the `==` operator is expected to work:

  • If either side of the expression is null then check equality using Java's `==` expression.

  • Else if both sides of the expression are of Java's type Number then:

    • If either side of the expression is a double then use the double value of both sides to test equality.

    • Else if either side of the expression is a float then use the float value of both sides to test equality.

    • Else if either side of the expression is a long then use long value of both sides to test equality.

    • Otherwise use the int value of both sides to test equality.

  • Otherwise use the equals method to compare the left side with the right side.

The `!=` operator is the negation of the above.
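
As an illustration, the equality rules can be modelled in Python like this. The sketch assumes that Java's `==` on a null operand is true only when both sides are null, and it merges the four numeric branches into two because Python has a single integer type and a single floating-point type; stellar_equals and stellar_not_equals are hypothetical names.

```python
def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def stellar_equals(left, right):
    """Model the documented `==` rules (hypothetical helper)."""
    # Null on either side: reference equality, true only if both are null
    if left is None or right is None:
        return left is right
    # Two numbers: widen to the wider type, then test equality
    if _is_number(left) and _is_number(right):
        if isinstance(left, float) or isinstance(right, float):
            return float(left) == float(right)
        return int(left) == int(right)
    # Everything else: the equivalent of Java's equals method
    return left == right


def stellar_not_equals(left, right):
    """`!=` is the negation of `==`."""
    return not stellar_equals(left, right)
```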

Table A.2. Stellar Language Functions

Function | Description | Input | Returns
ABS | Returns the absolute value of a number | number - The number to take the absolute value of | The absolute value of the number passed in
APPEND_IF_MISSING | Appends the suffix to the end of the string if the string does not already end with any of the suffixes.
  • string - The string to be appended.

  • suffix - The string suffix to append to the end of the string.

  • additionalsuffix - Optional - Additional string suffix that is a valid terminator.

A new string if the suffix was appended, the same string otherwise.
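
A rough Python model of the APPEND_IF_MISSING behaviour described above. The function name and the decision to return null for a null input are assumptions made for illustration, not taken from the Stellar source.

```python
def append_if_missing(string, suffix, *additional_suffixes):
    """Append suffix unless string already ends with any accepted suffix."""
    if string is None:
        # Assumption: a null input is passed through unchanged
        return None
    accepted = (suffix, *additional_suffixes)
    if any(string.endswith(candidate) for candidate in accepted):
        return string           # already terminated, return the same string
    return string + suffix      # otherwise append the primary suffix
```

For example, append_if_missing('abcmno', 'xyz', 'mno') leaves the string unchanged because 'mno' counts as a valid terminator.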
BIN | Computes the bin that the value is in given a set of bounds
  • value - The value to bin

  • bounds - A list of value bounds (excluding min and max) in sorted order

Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
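
The inequality bound(N-1) < value <= bound(N) is what a left bisection over the sorted bounds computes. A minimal Python sketch, assuming the bounds are already sorted (bin_index is a hypothetical name):

```python
from bisect import bisect_left


def bin_index(value, bounds):
    """Return N such that bounds[N-1] < value <= bounds[N].

    Values below the first bound land in bin 0; values above the last
    bound land in bin M, where M == len(bounds).
    """
    return bisect_left(bounds, value)
```

With bounds [10, 20, 30], the value 15 falls in bin 1 and the value 31 falls in bin 3, one past the last bound.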
BLOOM_ADD | Adds an element to the bloom filter passed in
  • bloom - The bloom filter

  • value* - The values to add

Bloom Filter
BLOOM_EXISTS | Determines whether the bloom filter contains the value
  • bloom - The bloom filter

  • value - The value to check

True if the filter might contain the value and false otherwise
BLOOM_INIT | Returns an empty bloom filter
  • expectedInsertions - The expected insertions

  • falsePositiveRate - The false positive rate you are willing to tolerate

Bloom Filter
BLOOM_MERGE | Returns a merged bloom filter
  • bloomfilters - A list of bloom filters to merge

Bloom Filter or null if the list is empty
CHOP | Removes the last character from a string.
  • string - The string to chop last character from, may be null.

String without last character, null if null string input.
CHOMP | Removes one newline from the end of a string if it is there, otherwise leaves it alone. A newline is "\n", "\r", or "\r\n".
  • string - The string to chomp a newline from, may be null.

String without newline, null if null string input.
COUNT_MATCHES | Counts how many times the substring appears in the larger string.
  • string - The CharSequence to check, may be null.

  • substring/character - The substring or character to count, may be null.

The number of non-overlapping occurrences, 0 if either CharSequence is null.
DAY_OF_MONTH | The numbered day within the month. The first day within the month has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered day within the month
DAY_OF_WEEK | The numbered day within the week. The first day of the week, Sunday, has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered day within the week.
DAY_OF_YEAR | The day number within the year. The first day of the year has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The day number within the year
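
The numbering conventions of the three day functions above can be illustrated in Python. This sketch assumes the timestamp is interpreted in UTC, which the text does not state; date_parts is a hypothetical helper.

```python
from datetime import datetime, timezone


def date_parts(epoch_millis):
    """Derive day-of-month, day-of-week and day-of-year from epoch milliseconds."""
    # Assumption: the timestamp is interpreted in UTC
    moment = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    return {
        "day_of_month": moment.day,                    # first day of month is 1
        # isoweekday() is Monday=1..Sunday=7; shift so that Sunday=1..Saturday=7
        "day_of_week": moment.isoweekday() % 7 + 1,
        "day_of_year": moment.timetuple().tm_yday,     # first day of year is 1
    }
```

For the epoch itself (0 milliseconds, a Thursday), the sketch reports day 1 of the month, day 5 of the week, and day 1 of the year.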
DOMAIN_REMOVE_SUBDOMAINS | Removes subdomains from a domain
  • domain - Fully qualified domain name

The domain without the subdomains. (For example, DOMAIN_REMOVE_SUBDOMAINS('mail.yahoo.com') yields 'yahoo.com')
DOMAIN_REMOVE_TLD | Removes the top level domain (TLD) suffix from a domain
  • domain - Fully qualified domain name

The domain without the TLD. (For example, DOMAIN_REMOVE_TLD('mail.yahoo.co.uk') yields 'mail.yahoo')
DOMAIN_TO_TLD | Extracts the top level domain from a domain
  • domain - Fully qualified domain name

The TLD of the domain. (For example, DOMAIN_TO_TLD('mail.yahoo.co.uk') yields 'co.uk')
ENDS_WITH | Determines whether a string ends with a suffix
  • string - The string to test

  • suffix - The proposed suffix

True if the string ends with the specified suffix and false if otherwise
ENRICHMENT_EXISTS | Interrogates the HBase table holding the simple HBase enrichment data and returns whether the enrichment type and indicator are in the table
  • enrichment_type - The enrichment type

  • indicator - The string indicator to look up

  • nosql_table - The NoSQL table to use

  • column_family - The column family to use

True if the enrichment indicator exists and false otherwise
ENRICHMENT_GET | Interrogates the HBase table holding the simple HBase enrichment data and retrieves the tabular value associated with the enrichment type and indicator
  • enrichment_type - The enrichment type

  • indicator - The string indicator to look up

  • nosql_table - The NoSQL table to use

  • column_family - The column family to use

A map associated with the indicator and enrichment type. Empty otherwise.
FILL_LEFT | Fills or pads a given string with a given character, to a given length on the left
  • input - string

  • fill - the fill character

  • len - the required length

The filled string
FILL_RIGHT | Fills or pads a given string with a given character, to a given length on the right
  • input - string

  • fill - the fill character

  • len - the required length

The filled string
FILTER | Applies a filter in the form of a lambda expression to a list. For example, `FILTER( [ 'foo', 'bar' ] , (x) -> x == 'foo')` would yield `[ 'foo' ]`.
  • list - List of arguments.

  • predicate - The lambda expression to apply. This expression is assumed to take one argument and return a boolean.

The input list filtered by the predicate.
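
For readers more familiar with general-purpose languages, the FILTER semantics correspond to a list comprehension with a one-argument predicate. The Python below is an analogy only; stellar_filter is a hypothetical name.

```python
def stellar_filter(items, predicate):
    """Keep only the items for which the one-argument predicate is true."""
    return [item for item in items if predicate(item)]


# Mirrors FILTER( [ 'foo', 'bar' ] , (x) -> x == 'foo')
kept = stellar_filter(['foo', 'bar'], lambda x: x == 'foo')
```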
FORMAT | Returns a formatted string using the specified format string and arguments. Uses Java's string formatting conventions
  • format - string

  • arguments - object(s)

A formatted string
GEO_GET | Looks up an IPv4 address and returns geographic information about it.
  • ip - The IPv4 address to look up

  • fields - Optional list of GeoIP fields to grab. Options are locID, country, city, postalCode, dmaCode, latitude, longitude, location_point

If a single field is requested, a string containing that field. If multiple fields are requested, a map of the fields as strings. Otherwise null.
GET | Returns the i'th element of the list
  • input - List

  • i - The index (0-based)

The i'th element of the list
GET_FIRST | Returns the first element of the list
  • input - List

First element of this list
GET_LAST | Returns the last element of the list
  • input - List

Last element of the list
HLLP_CARDINALITY | Returns HyperLogLogPlus-estimated cardinality for this set.
  • hyperLogLogPlus - The hllp set

Long value representing the cardinality for this set
HLLP_INIT | Initializes the set
  • p (required) - The precision value for the normal set.

  • sp - The precision value for the sparse set. If sp is not specified, the sparse set will be disabled.

A new HyperLogLogPlus set
HLLP_MERGE | Merges hllp sets together
  • hllp1 - First hllp set

  • hllp2 - Second hllp set

  • hllpn - Additional sets to merge

A new merged HyperLogLogPlus estimator set
HLLP_OFFER | Adds a value to the set
  • hyperLogLogPlus - The hllp set

  • o - Object to add to the set

The HyperLogLogPlus set with a new object added
IN_SUBNET | Returns true if an IP is within a subnet range
  • ip - The IP address in string form

  • cidr+ - One or more IP ranges specified in CIDR notation (for example, 192.168.0.0/24)

True if the IP address is within at least one of the network ranges and false if otherwise
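
The "at least one of the ranges" rule can be sketched with Python's standard ipaddress module. This models the documented behaviour only; in_subnet is a hypothetical name and the sketch makes no claim about how Stellar parses addresses.

```python
import ipaddress


def in_subnet(ip, *cidrs):
    """True if ip falls inside at least one of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    # strict=False accepts ranges whose host bits are set, e.g. 192.168.0.1/24
    return any(address in ipaddress.ip_network(cidr, strict=False) for cidr in cidrs)
```

For example, in_subnet('10.0.0.1', '192.168.0.0/24', '10.0.0.0/8') is true because the second range matches.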
IS_DATE | Determines if the date contained in the string conforms to the specified format
  • date - The date in string form

  • format - The format of the date

True if the date is in the specified format and false if otherwise
IS_DOMAIN | Tests if a string is a valid domain. Domain names are evaluated according to the standards RFC 1034 section 3 and RFC 1123 section 2.1.
  • address - The string to test

True if the string is a valid domain and false if otherwise
IS_EMAIL | Tests if a string is a valid email address
  • address - The string to test

True if the string is a valid email address and false if otherwise
IS_EMPTY | Returns true if string or collection is empty or null and false if otherwise
  • input - Object of string or collection type (for example, list)

True if the string or collection is empty or null and false if otherwise
IS_INTEGER | Determines whether or not an object is an integer
  • x - The object to test

True if the object can be converted to an integer and false if otherwise
IS_IP | Determines whether a string is an IP address
  • ip - The object to test for being an IP address

  • type (optional) - A string, or a collection of strings (for example, a list), each one of IPv4 or IPv6. The default is IPv4.

True if the string is an IP and false if otherwise
IS_URL | Tests if a string is a valid URL
  • url - The string to test

True if the string is a valid URL and false otherwise
JOIN | Joins the components in the list of strings with the specified delimiter
  • list - List of strings

  • delim - String delimiter

String
LENGTH | Returns the length of a string or size of a collection. Returns 0 for empty or null strings.
  • input - Object of string or collection type (for example, list).

Integer
LIST_ADD | Adds an element to a list.
  • list - List to add element to.

  • element - Element to add to list.

Resulting list with the item added at the end.
MAAS_GET_ENDPOINT | Inspects ZooKeeper and returns a map containing the name, version, and url for the model referred to by the input parameters
  • model_name - The name of the model

  • model_version - The optional version of the model. If the model version is not specified, the most current version is used.

A map containing the name, version, url for the REST endpoint (fields named name, version, and url). Note that the output of this function is suitable for input into the first argument of MAAS_MODEL_APPLY.
MAAS_MODEL_APPLY | Returns the output of a model deployed via Model as a Service. Note: Results are cached locally for 10 minutes.
  • endpoint - A map containing name, version, and url for the REST endpoint

  • function - The optional endpoint path; default is 'apply'

  • model_args - A dictionary of arguments for the model (these become request params)

The output of the model deployed as a REST endpoint in map form. Assumes REST endpoint returns a JSON map.
MAP | Applies a lambda expression to a list of arguments. For example, `MAP( [ 'foo', 'bar' ] , (x) -> TO_UPPER(x) )` would yield `[ 'FOO', 'BAR' ]`.
  • list - List of arguments.

  • transform_expression - The lambda expression to apply. This expression is assumed to take one argument.

The input list transformed item-wise by the lambda expression.
MAP_EXISTS | Checks for existence of a key in a map
  • key - The key to check for existence

  • map - The map to check for existence of the key

True if the key is found in the map and false if otherwise
MAP_GET | Gets the value associated with a key from a map
  • key - The key

  • map - The map

  • default - Optionally the default value to return if the key is not in the map

The object associated with the key in the map. If no value is associated with the key and default is specified, then default is returned. If no value is associated with the key or default, then null is returned.
MONTH | The number representing the month. The first month, January, has a value of 0.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The current month (0-based).
PREPEND_IF_MISSING | Prepends the prefix to the start of the string if the string does not already start with any of the prefixes.
  • string - The string to be prepended.

  • prefix - The string prefix to prepend to the start of the string.

  • additionalprefix - Optional - Additional string prefix that is valid.

A new String if prefix was prepended, the same string otherwise.
PROFILE_FIXED | The profile periods associated with a fixed lookback starting from now
  • durationAgo - How long ago should values be retrieved from?

  • units - The units of 'durationAgo'

  • config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.

The selected profile measurement timestamps. These are ProfilePeriod objects.
PROFILE_GET | Retrieves a series of values from a stored profile
  • profile - The name of the profile

  • entity - The name of the entity

  • periods - The list of profile periods to grab. These are ProfilePeriod objects.

  • groups_list - Optional - Must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when creating the profile.

  • config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.

The profile measurements
PROFILE_WINDOW | The profiler periods associated with a window selector statement from an optional reference timestamp.
  • WindowSelector - The statement specifying the window to select.

  • now - Optional - The timestamp to use for now.

  • config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter of the same name. Default is the empty Map, meaning no overrides.

The selected profile measurement periods. These are ProfilePeriod objects.
PROTOCOL_TO_NAME | Converts the IANA protocol number to the protocol name
  • IANA number

The protocol name associated with the IANA number
REDUCE | Reduces a list by a binary lambda expression. That is, the expression takes two arguments. For example, `REDUCE( [ 1, 2, 3 ] , (x, y) -> x + y, 0)` would sum the input list, yielding `6`.
  • list - List of arguments.

  • binary operation - The lambda expression function to apply to reduce the list. It is assumed that this takes two arguments, the first being the running total and the second being an item from the list.

  • initial_value - The initial value to use.

The reduction of the list.
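
The running-total behaviour of REDUCE matches a left fold. A short Python analogy using functools.reduce, where stellar_reduce is a hypothetical wrapper that only fixes the argument order to match the entry above:

```python
from functools import reduce


def stellar_reduce(items, binary_operation, initial_value):
    """Fold items left to right; the first lambda argument is the running total."""
    return reduce(binary_operation, items, initial_value)


# Mirrors REDUCE( [ 1, 2, 3 ] , (x, y) -> x + y, 0)
total = stellar_reduce([1, 2, 3], lambda x, y: x + y, 0)
```

An empty list simply returns the initial value, since the lambda is never applied.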

REGEXP_MATCH | Determines whether a regex matches a string
  • string - The string to test

  • pattern - The proposed regex pattern

True if the regex pattern matches the string and false if otherwise
SPLIT | Splits the string by the delimiter
  • input - String to split

  • delim - String delimiter

List of strings
STARTS_WITH | Determines whether a string starts with a prefix
  • string - The string to test

  • prefix - The proposed prefix

True if the string starts with the specified prefix and false if otherwise
STATS_ADD | Adds one or more input values to those that are used to calculate the summary statistics
  • stats - The Stellar statistics object. If null, then a new one is initialized

  • value+ - One or more numbers to add

A Stellar statistics object
STATS_BIN | Computes the bin that the value is in based on the statistical distribution.
  • stats - The Stellar statistics object

  • value - The value to bin

  • bound? - A list of percentile bin bounds (excluding min and max) or a string representing a known and common set of bins. For convenience, we have provided QUARTILE, QUINTILE, and DECILE which you can pass in as a string arg. If this argument is omitted, then we assume a Quartile bin split.

Which bin N the value falls in such that bound(N-1) < value <= bound(N). No min and max bounds are provided, so values smaller than the 0'th bound go in the 0'th bin, and values greater than the last bound go in the M'th bin.
STATS_COUNT | Calculates the count of the values accumulated (or in the window if a window is used)
  • stats - The Stellar statistics object

The count of the values in the window or NaN if the statistics object is null
STATS_GEOMETRIC_MEAN | Calculates the geometric mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The geometric mean of the values in the window or NaN if the statistics object is null
STATS_INIT | Initializes a statistics object
  • window_size - The number of input data values to maintain in a rolling window in memory. If window_size is equal to 0, then no rolling window is maintained. Using no rolling window is less memory intensive, but cannot calculate certain statistics like percentiles and kurtosis.

A Stellar statistics object
STATS_KURTOSIS | Calculates the kurtosis of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The kurtosis of the values in the window or NaN if the statistics object is null
STATS_MAX | Calculates the maximum of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The maximum of the accumulated values in the window or NaN if the statistics object is null
STATS_MEAN | Calculates the mean of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The mean of the values in the window or NaN if the statistics object is null
STATS_MERGE | Merges statistics objects
  • statistics - A list of statistics providers

A Stellar statistics object
STATS_MIN | Calculates the minimum of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The minimum of the accumulated values in the window or NaN if the statistics object is null
STATS_PERCENTILE | Computes the p'th percentile of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

  • p - A double where 0<=1 representing the percentile

The p'th percentile of the data or NaN if the statistics object is null
STATS_POPULATION_VARIANCE | Calculates the population variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The population variance of the values in the window or NaN if the statistics object is null
STATS_QUADRATIC_MEAN | Calculates the quadratic mean of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The quadratic mean of the values in the window or NaN if the statistics object is null
STATS_SD | Calculates the standard deviation of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The standard deviation of the values in the window or NaN if the statistics object is null
STATS_SKEWNESS | Calculates the skewness of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The skewness of the values in the window or NaN if the statistics object is null
STATS_SUM | Calculates the sum of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The sum of the values in the window or NaN if the statistics object is null
STATS_SUM_LOGS | Calculates the sum of the (natural) log of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The sum of the (natural) log of the values in the window or NaN if the statistics object is null
STATS_SUM_SQUARES | Calculates the sum of the squares of the accumulated values (or in the window if a window is used)
  • stats - The Stellar statistics object

The sum of the squares of the values in the window or NaN if the statistics object is null
STATS_VARIANCE | Calculates the variance of the accumulated values (or in the window if a window is used). See http://commons.apache.org/proper/commons-math/userguide/stat.html#a1.2_Descriptive_statistics
  • stats - The Stellar statistics object

The variance of the values in the window or NaN if the statistics object is null
STRING_ENTROPY | Computes the base-2 Shannon entropy of a string. | input - string | The base-2 Shannon entropy of the string (https://en.wikipedia.org/wiki/Entropy_(information_theory)#Definition). The unit of this is bits.
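
Base-2 Shannon entropy over the characters of a string can be computed as shown in this Python sketch. It follows the textbook definition; returning 0 for an empty or null string is an assumption, and string_entropy is a hypothetical name rather than the Stellar implementation.

```python
import math
from collections import Counter


def string_entropy(text):
    """Base-2 Shannon entropy of the character distribution, in bits."""
    if not text:
        # Assumption: empty or null input has zero entropy
        return 0.0
    length = len(text)
    # H = sum over characters of p * log2(1 / p), where p = count / length
    return sum(
        (count / length) * math.log2(length / count)
        for count in Counter(text).values()
    )
```

A string of one repeated character has 0 bits of entropy, while a string with two equally frequent characters has exactly 1 bit.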
SYSTEM_ENV_GET | Returns the value associated with an environment variable
  • env_var - Environment variable name to get the value for

String
SYSTEM_PROPERTY_GET | Returns the value associated with a Java system property
  • key - Property to get the value for

String
TO_DOUBLE | Transforms the first argument to a double precision number
  • Input - Object of string or numeric type

Double version of the first argument
TO_EPOCH_TIMESTAMP | Returns the epoch timestamp of the dateTime in the specified format. If the format does not have a timezone and you wish to assume a given timezone, you may specify the timezone optionally.
  • dateTime - DateTime in string format

  • format - DateTime format as string

  • timezone - Optional timezone in a string format

Epoch timestamp
TO_INTEGER | Transforms the first argument to an integer
  • Input - Object of string or numeric type

Integer version of the first argument
TO_LOWER | Transforms the first argument to a lowercase string
  • Input - String

String
TO_STRING | Transforms the first argument to a string
  • Input - Object

String
TO_UPPER | Transforms the first argument to an uppercase string
  • Input - String

Uppercase string
TRIM | Trims white space from both sides of a string
  • Input - String

String
URL_TO_HOST | Extracts the hostname from a URL
  • url - URL in string form

The hostname from the URL as a string (for example, URL_TO_HOST('http://www.yahoo.com/foo') would yield 'www.yahoo.com')
URL_TO_PATH | Extracts the path from a URL
  • url - URL in string form

The path from the URL as a string (for example, URL_TO_PATH('http://www.yahoo.com/foo') would yield 'foo')
URL_TO_PORT | Extracts the port from a URL. If the port is not explicitly stated in the URL, then an implicit port is inferred based on the protocol.
  • url - URL in string form

The port used in the URL as an integer (for example URL_TO_PORT('http://www.yahoo.com/foo') would yield 80)
URL_TO_PROTOCOL | Extracts the protocol from a URL
  • url - URL in string form

The protocol from the URL as a string (for example, URL_TO_PROTOCOL('http://www.yahoo.com/foo') would yield 'http')
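
To make the four URL_TO_* functions above concrete, the following Python sketch splits a URL with the standard urllib.parse module and infers a default port from the protocol. The default-port table and the name url_parts are assumptions for illustration; note also that Python's urlsplit reports the path with its leading slash.

```python
from urllib.parse import urlsplit

# Assumption: a small illustrative table of implicit ports per protocol
_DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def url_parts(url):
    """Split a URL into protocol, host, path and (possibly inferred) port."""
    parts = urlsplit(url)
    port = parts.port
    if port is None:
        # No explicit port in the URL: infer it from the protocol
        port = _DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
        "port": port,
    }
```

Calling url_parts('http://www.yahoo.com/foo') gives the protocol 'http', the host 'www.yahoo.com', and the inferred port 80.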
WEEK_OF_MONTH | The numbered week within the month. The first week within the month has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered week within the month
WEEK_OF_YEAR | The numbered week within the year. The first week in the year has a value of 1.
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The numbered week within the year
YEAR | The number representing the year
  • dateTime - The datetime as a long representing the milliseconds since UNIX epoch

The current year