Configuring Fault Tolerance
Also available as:
PDF
loading table of contents...

Perform a Backup of the HDFS Metadata

You can back up HDFS metadata without affecting the availability of NameNode.

  1. Make sure the Standby NameNode checkpoints the namespace to fsimage_ once per hour.
  2. Deploy monitoring on both NameNodes to confirm that checkpoints are triggering regularly.
    This helps reduce the amount of missing transactions in the event that you need to restore from a backup containing only fsimage files without subsequent edit logs. It is good practice to monitor this because edit logs that are large in size and without checkpoints can cause long delays after a NameNode restart while it replays those transactions.
  3. Back up the most recent “fsimage_*” and “fsimage_*.md5” from the standby NameNode periodically.
    Try to keep the latest version of the file on another machine in the cluster.
  4. Back up the VERSION file from the standby NameNode.