This alert is triggered if the ZooKeeper process cannot be determined to be up and
listening on the network for the configured critical threshold, given in seconds. It uses
the Nagios check_tcp plugin.
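The kind of check performed here can be approximated with a plain TCP connection attempt. The following is a minimal sketch, not the check_tcp plugin itself; the function name tcp_port_is_open and the 5-second default timeout are illustrative assumptions, not values from the alert configuration.

```python
import socket


def tcp_port_is_open(host: str, port: int, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical helper: it only approximates what a TCP liveness check does.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, tcp_port_is_open("zk1.example.com", 2181) would test a hypothetical ZooKeeper host on port 2181, which is ZooKeeper's default client port; a cluster may be configured to use a different one.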
The Nagios server cannot connect to one or more ZooKeeper processes
The ZooKeeper hosts are down
The ZooKeeper processes are running but are not listening on the correct network port/address
Check the list of live/dead DataNodes by following the DataNodes (live/dead/decom) link on the Cluster Summary section of the Dashboard
Check for any errors in the ZooKeeper logs (/var/log/hadoop/zookeeper) and restart the ZooKeeper hosts/processes
Run the netstat -tuplpn command to check if the ZooKeeper process is bound to the correct network port.
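The port binding check can also be scripted by scanning saved netstat output for a listening socket on the expected port. This is a rough sketch that assumes the usual Linux net-tools column layout (protocol, queues, local address, foreign address, state, PID/program); the function name listening_addresses and the sample output are made up for illustration, and 2181 is only ZooKeeper's default client port.

```python
from typing import List


def listening_addresses(netstat_output: str, port: int) -> List[str]:
    """Return local addresses in LISTEN state on the given TCP port."""
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP rows, and anything too short to carry a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_address, state = fields[3], fields[5]
        # The port is whatever follows the last colon (works for IPv4 and IPv6).
        if state == "LISTEN" and local_address.rsplit(":", 1)[-1] == str(port):
            matches.append(local_address)
    return matches


# Illustrative output only, not captured from a real host.
sample = """\
Proto Recv-Q Send-Q Local Address   Foreign Address  State   PID/Program name
tcp        0      0 0.0.0.0:2181    0.0.0.0:*        LISTEN  1234/java
tcp        0      0 127.0.0.1:25    0.0.0.0:*        LISTEN  900/master
"""
print(listening_addresses(sample, 2181))
```

An empty result would suggest that nothing is listening on that port, or that the process is bound to a different port than the one being checked.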