Verify that NameNode failure triggers failover
  1. Start the NameNode VM and run the HAM application configured to work with this NameNode.

  2. In HAM, start blocking LS operations.

  3. SSH to the NameNode VM and terminate the NameNode process.

     service hadoop-namenode stop

    Alternatively, identify the NameNode process ID with jps -v and terminate the process with kill -9.

  4. Ensure that you see the following expected results:

    • In HAM, the NameNode status area (at the top of the application) should display offline status for NameNode. The main area should also stop displaying any new text (this indicates that the file system operations are blocked).

    • In the vSphere Management UI, vSphere should terminate the NameNode VM within 60-90 seconds and start a new instance.

    • Once the NameNode service restarts, its status should be displayed both in the vSphere UI and in the HAM status indicator.

    • The blocking operations started in HAM should now resume. The failover should not affect the client, apart from the pause while the failover is in progress.

    • SSH to the NameNode VM again and verify that the host name, IP address, and SSH host key have not changed.
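
The kill -9 alternative in step 3 and the final identity check can be sketched as a few shell helpers. This is a minimal sketch, not part of HDP or HAM: the function names (namenode_pid, vm_identity, identity_unchanged) are hypothetical, the default SSH host key path is an assumption that may differ on your distribution, and jps is assumed to be run as a user that can see the NameNode JVM.

```shell
#!/bin/sh
# Hypothetical helpers for the failover test; names are illustrative only.

# Read `jps -v` output on stdin and print the PID of the NameNode JVM.
# Comparing the second field exactly avoids matching SecondaryNameNode.
namenode_pid() {
    awk '$2 == "NameNode" { print $1 }'
}

# Print the VM identity: host name, IP address(es), and SSH host key
# fingerprint. The default key path is an assumption; pass another path
# as the first argument if your system stores the key elsewhere.
vm_identity() {
    hostname
    hostname -i
    ssh-keygen -lf "${1:-/etc/ssh/ssh_host_rsa_key.pub}"
}

# Succeed only if two saved identity snapshots are byte-for-byte identical.
identity_unchanged() {
    cmp -s "$1" "$2"
}

# Example use on the NameNode VM (commented out so that sourcing this
# file has no side effects):
#   vm_identity > "$HOME/identity.before"
#   kill -9 "$(jps -v | namenode_pid)"
#   ... wait for vSphere to restart the VM, then SSH in again ...
#   vm_identity > "$HOME/identity.after"
#   identity_unchanged "$HOME/identity.before" "$HOME/identity.after" \
#       && echo "identity unchanged"
```

Saving the snapshot before the failure and comparing it afterwards replaces a visual check of the host name, IP address, and host key with a single pass/fail result.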