1.2. Alert Definitions and Instances

An Alert Definition includes a name, a description, and a check interval. Depending on the Alert Type, it also includes configurable thresholds for each status.
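As a rough illustration of the fields named above, an Alert Definition can be pictured as a small record. This is a minimal sketch, not the product's actual schema: the class name `AlertDefinition`, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AlertDefinition:
    """Hypothetical model of an Alert Definition (illustrative field names)."""
    name: str
    description: str
    alert_type: str            # one of PORT, METRIC, AGGREGATE, WEB, SCRIPT
    check_interval: int        # how often the check runs; unit is an assumption
    # Thresholds keyed by status; left empty for types without configurable thresholds.
    thresholds: Dict[str, float] = field(default_factory=dict)


# Sample values are made up for illustration only.
definition = AlertDefinition(
    name="Hive Metastore Process",
    description="Checks that the Hive Metastore port responds.",
    alert_type="PORT",
    check_interval=1,
    thresholds={"WARN": 1.5, "CRIT": 5.0},
)
print(definition.thresholds["WARN"])
```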

The following table lists the alert types, their possible statuses, and whether their thresholds are configurable:

Alert Types

| Type | Description | Status | Thresholds Configurable | Units |
|------|-------------|--------|-------------------------|-------|
| PORT | Watches a port based on a configuration property as the URI. Example: Hive Metastore Process | OK, WARN, CRIT | Yes | seconds |
| METRIC | Watches a metric based on a configuration property. Example: ResourceManager RPC Latency | OK, WARN, CRIT | Yes | variable |
| AGGREGATE | Aggregate of status for another alert definition. Example: percentage NodeManagers Available | OK, WARN, CRIT | Yes | percentage |
| WEB | Watches a Web UI and adjusts status based on response. Example: App Timeline Web UI | OK, WARN, CRIT | No | n/a |
| SCRIPT | Uses a custom script to handle checking. Example: NodeManager Health Summary | OK, CRIT | No | n/a |
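
For the types with configurable thresholds, a measured value is compared against the WARN and CRIT thresholds to produce a status. The sketch below shows one way that comparison could work. It assumes that larger values are worse and that a value equal to a threshold triggers that status; the function name `classify` and the sample thresholds are hypothetical and do not describe the product's actual evaluation logic.

```python
def classify(value: float, warn: float, crit: float) -> str:
    """Return OK, WARN, or CRIT for a measurement where larger is worse.

    Assumption: a value at or above a threshold triggers that status.
    """
    if warn > crit:
        raise ValueError("warn threshold must not exceed crit threshold")
    if value >= crit:
        return "CRIT"
    if value >= warn:
        return "WARN"
    return "OK"


# Example: a PORT-style check measured in seconds, with made-up thresholds.
for response_time in (0.2, 2.0, 7.0):
    print(response_time, classify(response_time, warn=1.5, crit=5.0))
```

WEB and SCRIPT alerts would not fit this pattern, since their thresholds are not configurable and their status comes from the response or the script result.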

