The hdfs haadmin command can be used to manually fail over a NameNode and to check NameNode status.
Running the hdfs haadmin command without any arguments returns a list of the available subcommands:
Usage: DFSHAAdmin [-ns <nameserviceId>] [-transitionToActive <serviceId>] [-transitionToStandby <serviceId>] [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>] [-getServiceState <serviceId>] [-checkHealth <serviceId>] [-help <command>]
This section provides a description of each of these subcommands.
failover
Initiates a failover between two NameNodes.
Usage:
hdfs haadmin -failover <target standby (service id of namenode)> <target active (service id of namenode)>
This subcommand causes a failover from the first provided NameNode to the second. It can be used for testing or when hardware maintenance needs to be performed on the active machine.
If the first NameNode is in the Active state, an attempt will be made to gracefully transition it to the Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be attempted in order until one succeeds. Only after this process will the second NameNode be transitioned to the Active state. If the fencing methods fail, the second NameNode is not transitioned to Active state and an error is returned. For more information about configuring fencing methods, see Configure NameNode HA Cluster.
If the first NameNode is in the Standby state, the command will execute and say the failover is successful, but no failover occurs and the two NameNodes will remain in their original states.
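Because the command reports success even when no failover takes place, a wrapper script may want to verify the NameNode states itself. The following is a minimal sketch of one way to do that, using only the -getServiceState and -failover subcommands described in this section. The function name guarded_failover and its return codes are hypothetical and are not part of Hadoop.

```shell
# Sketch: fail over only when the source NameNode is really Active,
# then confirm the target's state afterwards.
# guarded_failover is a hypothetical helper name, not part of Hadoop.
guarded_failover() {
    gf_from="$1"
    gf_to="$2"

    # -getServiceState prints "active" or "standby" to STDOUT.
    gf_state="$(hdfs haadmin -getServiceState "$gf_from")" || return 1
    if [ "$gf_state" != "active" ]; then
        echo "Skipping failover: $gf_from is '$gf_state', not active" >&2
        return 2
    fi

    hdfs haadmin -failover "$gf_from" "$gf_to" || return 1

    # Do not rely on the success message alone; re-check the target.
    gf_state="$(hdfs haadmin -getServiceState "$gf_to")" || return 1
    [ "$gf_state" = "active" ]
}

# Usage (assumes serviceId values nn1 and nn2):
#   guarded_failover nn1 nn2
```

The function returns 0 only if the second NameNode reports "active" after the failover, 2 if the first NameNode was not Active to begin with, and 1 on any other failure.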
You can use the /etc/hadoop/conf/hdfs-site.xml file to determine the serviceId of each NameNode. In a High Availability (HA) cluster, the NameNode ID can be found in the dfs.namenode.rpc-address.[$nameservice ID].[$name node ID] property. The following is an example of NameNodes with corresponding serviceId values nn1 and nn2:
<property>
  <name>dfs.namenode.rpc-address.myHAcluster.nn1</name>
  <value>63hdp20ambari252.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.myHAcluster.nn2</name>
  <value>63hdp20ambari251.example.com:8020</value>
</property>
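The serviceId values can also be pulled out of the file with standard text tools instead of being read by eye. The following is a rough sketch that scans the property names for a given nameservice. The function name list_namenode_ids is hypothetical, and the sketch assumes that NameNode IDs contain no dots and that the installed grep supports the -o option.

```shell
# Sketch: list the NameNode IDs of one nameservice by scanning the
# dfs.namenode.rpc-address.<nameservice>.<namenode> property names.
# list_namenode_ids is a hypothetical helper name; it assumes IDs
# contain no dots and that grep supports -o (GNU and BSD grep do).
list_namenode_ids() {
    lni_ns="$1"
    lni_file="${2:-/etc/hadoop/conf/hdfs-site.xml}"
    # Keep only the matching <name> elements, then strip everything
    # except the final dot-separated component (the NameNode ID).
    grep -o "<name>dfs\.namenode\.rpc-address\.${lni_ns}\.[^<]*</name>" "$lni_file" |
        sed -e 's|</name>$||' -e 's|.*\.||'
}

# Usage (nameservice name taken from the example above):
#   list_namenode_ids myHAcluster
```

For the example properties shown above, this would print nn1 and nn2 on separate lines.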
Example:
hdfs haadmin -failover nn1 nn2
For the preceding example, assume that nn1 was Active and nn2 was Standby prior to running the failover command. After the command is executed, nn2 will be Active and nn1 will be Standby.
If nn1 is Standby and nn2 is Active prior to running the failover command, the command will execute and say the failover is successful, but no failover occurs and nn1 and nn2 will remain in their original states.
getServiceState
Determines whether a given NameNode is in an Active or Standby state.
This subcommand connects to the referenced NameNode, determines its current state, and prints either "standby" or "active" to STDOUT. This subcommand might be used by cron jobs or monitoring scripts.
Usage:
hdfs haadmin -getServiceState <serviceId of NameNode>
Example:
hdfs haadmin -getServiceState nn1
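As an illustration of the monitoring use case, the following sketch counts how many NameNodes report "active", so that a cron job could raise an alert when the count is not exactly one. The function name count_active_namenodes is hypothetical. The sketch assumes that an unreachable NameNode either makes the command fail or prints something other than "active".

```shell
# Sketch: count how many of the given NameNodes report "active".
# count_active_namenodes is a hypothetical helper name.
count_active_namenodes() {
    can_count=0
    for can_id in "$@"; do
        # A failed query is treated as "not active" rather than as an error.
        can_state="$(hdfs haadmin -getServiceState "$can_id" 2>/dev/null)" || can_state=""
        if [ "$can_state" = "active" ]; then
            can_count=$((can_count + 1))
        fi
    done
    echo "$can_count"
}

# Usage in a monitoring script (assumes serviceId values nn1 and nn2):
#   [ "$(count_active_namenodes nn1 nn2)" -eq 1 ] || echo "unexpected active NameNode count" >&2
```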
checkHealth
Checks the health of a given NameNode.
This subcommand connects to the referenced NameNode and checks its health. The NameNode is capable of performing diagnostics that include checking to see if internal services are running as expected. This command will return 0 if the NameNode is healthy; otherwise it will return a non-zero code.
Note: This subcommand is still being implemented and currently always returns success (0) unless the given NameNode is down.
Usage:
hdfs haadmin -checkHealth <serviceId of NameNode>
Example:
hdfs haadmin -checkHealth nn1
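Because the result is reported through the exit code rather than through printed output, a script has to test the return status. The following is a minimal sketch of how that might look. The function name report_namenode_health and the message wording are hypothetical. As noted above, a non-zero code currently means in practice that the NameNode is down.

```shell
# Sketch: report NameNode health from the exit code of -checkHealth.
# report_namenode_health is a hypothetical helper name.
report_namenode_health() {
    if hdfs haadmin -checkHealth "$1"; then
        echo "$1: healthy"
    else
        # $? still holds the exit code of the -checkHealth call here.
        rnh_rc=$?
        echo "$1: unhealthy or unreachable (exit code $rnh_rc)"
        return "$rnh_rc"
    fi
}

# Usage (assumes serviceId value nn1):
#   report_namenode_health nn1
```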
transitionToActive and transitionToStandby
Transitions the state of the given NameNode to Active or Standby.
Usage:
hdfs haadmin -transitionToActive <serviceId of NameNode>
hdfs haadmin -transitionToStandby <serviceId of NameNode>
Note: These commands do not attempt to perform any fencing, and therefore are not recommended. Instead, Hortonworks recommends using the -failover subcommand described above.