3. Validating HA Configuration

  1. Verify the state of each NameNode, using one of the following methods:

    • Open the web page for each NameNode in a browser, using the configured URL.

      The HA state of the NameNode should appear in the configured address label. For example: NameNode 'example.com:8020' (standby).

      Note

      The NameNode state may be "standby" or "active". After bootstrapping, the HA NameNode state is initially "standby".

    • Query the state of a NameNode, using JMX (tag.HAState).

    • Query the service state, using the following command, where <serviceId> is the ID of the NameNode to check:

      hdfs haadmin -getServiceState <serviceId>
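
The JMX method can also be scripted. The following is a minimal sketch, not a definitive implementation. It assumes that the NameNode web endpoint serves JMX data as JSON under /jmx and that the FSNamesystem bean carries the tag.HAState attribute; the function names and the example address are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: the FSNamesystem bean exposes "tag.HAState" on the /jmx endpoint.
JMX_QUERY = "/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem"


def parse_ha_state(jmx_json: str) -> str:
    """Return the tag.HAState value from a NameNode /jmx JSON response."""
    beans = json.loads(jmx_json).get("beans", [])
    for bean in beans:
        if "tag.HAState" in bean:
            return bean["tag.HAState"]
    raise ValueError("tag.HAState not found in JMX response")


def fetch_ha_state(http_address: str, timeout: float = 5.0) -> str:
    """Query one NameNode web endpoint given as 'host:port'.

    The address is whatever HTTP address is configured for that NameNode;
    nothing here is specific to a particular cluster.
    """
    url = f"http://{http_address}{JMX_QUERY}"
    with urlopen(url, timeout=timeout) as resp:
        return parse_ha_state(resp.read().decode("utf-8"))
```

For example, calling fetch_ha_state("nn1.example.com:50070") against a hypothetical host should return either "active" or "standby". Run it once per NameNode and confirm that exactly one reports "active".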
  2. Verify automatic failover.

    1. Locate the Active NameNode.

      Use the NameNode web UI to check the status for each NameNode host machine.

    2. Cause a failure on the Active NameNode host machine.

      1. Turn off automatic restart of the service.

        1. In the Windows Services pane, locate the Apache Hadoop NameNode service, right-click it, and choose Properties.

        2. On the Recovery tab, select Take No Action for First, Second, and Subsequent Failures, then choose Apply.

      2. Simulate a JVM crash.

        For example, you can use the following command to simulate a JVM crash:

        taskkill.exe /t /f /im namenode.exe

        Alternatively, power-cycle the machine or unplug its network interface to simulate an outage.

        The Standby NameNode state should become Active within several seconds.

        Note

        The time required to detect a failure and trigger a failover depends on the value of the ha.zookeeper.session-timeout.ms property. The default value is 5 seconds.
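
To see which timeout applies on a given host, the property can be read from the Hadoop configuration. The sketch below assumes the standard Hadoop configuration XML layout (property elements with name and value children) and takes the file contents as a string. Which file holds the setting on your cluster is an assumption to verify, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

PROPERTY = "ha.zookeeper.session-timeout.ms"
DEFAULT_MS = 5000  # the 5-second default stated in the note


def session_timeout_ms(config_xml: str) -> int:
    """Return the ZooKeeper session timeout in milliseconds.

    Falls back to DEFAULT_MS when the property is not set in the given XML.
    """
    root = ET.fromstring(config_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == PROPERTY:
            return int(prop.findtext("value", "").strip())
    return DEFAULT_MS
```

A larger value delays failure detection, so allow at least this long before concluding that failover did not occur.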

      3. Verify that the former Standby NameNode is now Active.

        1. If the Standby NameNode does not become Active, verify that the HA settings are configured correctly.

        2. Check the log files of the ZKFC daemons and NameNode daemons to diagnose issues.
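
The failover check can be automated by polling the service state until the surviving NameNode reports "active". This is a rough sketch under stated assumptions: the hdfs command is on the PATH, it prints the state as a single word, and the service ID passed in (for example "nn2") is a hypothetical placeholder for the configured NameNode ID.

```python
import shutil
import subprocess
import time


def get_service_state(service_id: str) -> str:
    """Run 'hdfs haadmin -getServiceState <service_id>' and return its output."""
    hdfs = shutil.which("hdfs")  # resolves hdfs.cmd on Windows via PATHEXT
    if hdfs is None:
        raise FileNotFoundError("hdfs command not found on PATH")
    result = subprocess.run(
        [hdfs, "haadmin", "-getServiceState", service_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip().lower()


def wait_until_active(service_id, timeout_s=60.0, interval_s=2.0,
                      query=get_service_state, sleep=time.sleep,
                      clock=time.monotonic) -> bool:
    """Poll until the NameNode reports 'active' or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        try:
            if query(service_id) == "active":
                return True
        except (subprocess.CalledProcessError, OSError):
            pass  # command failed or NameNode unreachable; retry until timeout
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Calling wait_until_active("nn2") right after stopping the Active NameNode returns True once failover completes. A False result points to the configuration and log checks described above.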
