Prepare to Back Up the HDFS Metadata
Regardless of the solution, a full, up-to-date continuous backup of the namespace is not possible. Some of the most recent data is always lost. HDFS is not an Online Transaction Processing (OLTP) system. Most data can be easily recreated if you re-run Extract, Transform, Load (ETL) or processing jobs.
Normal NameNode failures are handled by the Standby NameNode. Doing so creates a safety-net for the very unlikely case where both master NameNodes fail.
In the case of both NameNode failures, you can start the NameNode service with the most recent image of the namespace.
Name Nodes maintain the namespace as follows:
Standby NameNodes keep a namespace image in memory based on
editsavailable in a storage ensemble in Journal Nodes.
Standby NameNodes make a namespace checkpoint and saves an
Standby NameNodes transfer the
fsimageto the primary NameNodes using HTTP.
Both NameNodes write
fsimages to disk in the following
NameNodes write the namespace to a file
NameNodes creates an
NameNodes moves the file
The process by which both NameNodes write
fsimages to disk
The most recent namespace image on disk in an
fsimage_*file is on the standby NameNode.
fsimage_*file on disk is finalized and does not receive updates.