6. Deploy HDP HA Configurations

Use the instructions in this section to configure Full-Stack HA failover resiliency for the HDP clients.

Note

Your Hadoop configuration directories are defined during the HDP installation. For details, see: Setting Up Hadoop Configuration.

Step 1: Edit the $HADOOP_CONF_DIR/hdfs-site.xml file to add the following properties:

  • Enable the HDFS client retry policy.

    <property>
     <name>dfs.client.retry.policy.enabled</name>
     <value>true</value>
     <description> Enables HDFS client retry in case of NameNode failure.</description>
    </property>
        
  • Configure protection for the NameNode edit log.

    <property>
     <name>dfs.namenode.edits.toleration.length</name>
     <value>8192</value>
     <description> Prevents corruption of NameNode edit log.</description>
    </property>
        
  • Configure safe mode extension time.

    <property>
     <name>dfs.safemode.extension</name>
     <value>10</value>
     <description>The default value (30 seconds) is appropriate for very large clusters. For small to large clusters (up to 200 nodes), the recommended value is 10 seconds.</description>
    </property>
        
  • Ensure that the allocated DFS blocks persist across multiple failovers.

    <property>
     <name>dfs.persist.blocks</name>
     <value>true</value>
     <description>Ensure that the allocated DFS blocks persist across multiple failovers.</description>
    </property>
        
  • Configure the delay for the first block report.

    <property>
     <name>dfs.blockreport.initialDelay</name>
     <value>10</value>
     <description> Delay (in seconds) for first block report.</description>
    </property>
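
The Step 1 settings can be checked mechanically after editing. The following is a minimal Python sketch, not part of HDP: it assumes the file uses the usual Hadoop layout of property elements under a single root element, and the function names and sample content are hypothetical. The expected values are copied from the properties listed above.

```python
import xml.etree.ElementTree as ET

# Expected values copied from the Step 1 properties.
EXPECTED_HDFS = {
    "dfs.client.retry.policy.enabled": "true",
    "dfs.namenode.edits.toleration.length": "8192",
    "dfs.safemode.extension": "10",
    "dfs.persist.blocks": "true",
    "dfs.blockreport.initialDelay": "10",
}


def read_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def find_mismatches(xml_text, expected):
    """Return {name: (expected, actual)} for missing or differing properties."""
    actual = read_properties(xml_text)
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }


# Hypothetical file content, for illustration only: one property is correct,
# one has a different value, and the other three are missing.
SAMPLE = """<configuration>
  <property>
    <name>dfs.client.retry.policy.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.safemode.extension</name>
    <value>30</value>
  </property>
</configuration>"""

for name, (want, got) in sorted(find_mismatches(SAMPLE, EXPECTED_HDFS).items()):
    print(f"{name}: expected {want!r}, found {got!r}")
```

To check a real installation, pass the text of $HADOOP_CONF_DIR/hdfs-site.xml to find_mismatches instead of the sample string. An empty result means all five properties are present with the values shown in this step.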
        

Step 2: Modify the following property in the $HADOOP_CONF_DIR/core-site.xml file:

<property>
 <name>fs.checkpoint.period</name>
 <value>3600</value>
 <description> The number of seconds between two periodic checkpoints.</description>
</property>
    

With a value of 3600 seconds, a checkpoint is performed every hour.
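
Because Step 2 modifies a property that may already exist, the change amounts to replacing the value element of the matching property. The following minimal Python sketch illustrates this; set_property and the sample content (including the starting value of 1800) are hypothetical, and the sketch assumes the usual Hadoop layout of property elements under one root element.

```python
import xml.etree.ElementTree as ET


def set_property(root, name, value):
    """Set the <value> of the named property; return True if it was found."""
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value_el = prop.find("value")
            if value_el is None:
                # The property exists but has no <value> child yet.
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return True
    return False


# Hypothetical core-site.xml content, for illustration only.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.checkpoint.period</name>
    <value>1800</value>
  </property>
</configuration>"""

root = ET.fromstring(SAMPLE_CORE_SITE)
found = set_property(root, "fs.checkpoint.period", "3600")
updated_xml = ET.tostring(root, encoding="unicode")
print(updated_xml)
```

The function returns False when the property is absent, so a caller can decide whether to add it instead. Serializing with ElementTree does not preserve comments in the original file, so back up the file before writing any result to it.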

Step 3: Edit the $HADOOP_CONF_DIR/mapred-site.xml file to add the following properties:

  • Enable the JobTracker’s safe mode functionality.

    <property>
     <name>mapreduce.jt.hdfs.monitor.enable</name>
     <value>true</value>
     <description> Enable the JobTracker to go into safe mode when the NameNode is not responding.</description>
    </property>
        
  • Enable retry for JobTracker clients (when the JobTracker is in safe mode).

    <property>
     <name>mapreduce.jobclient.retry.policy.enabled</name>
     <value>true</value>
     <description> Enable the MapReduce job client to retry job submission when the JobTracker is in safe mode.</description>
    </property>
        
  • Enable recovery of the JobTracker’s queue after it is restarted.

    <property>
     <name>mapred.jobtracker.restart.recover</name>
     <value>true</value>
     <description> Enable the JobTracker to recover its queue after it is restarted.</description>
    </property>
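
Step 3 adds properties rather than changing existing ones, so a scripted version only needs to append whatever is missing. The following is a minimal Python sketch under the same assumptions about file layout; add_missing_properties and the sample content are hypothetical, and the property names and values are copied from this step.

```python
import xml.etree.ElementTree as ET

# Names and values copied from the Step 3 properties.
MAPRED_HA_PROPERTIES = {
    "mapreduce.jt.hdfs.monitor.enable": "true",
    "mapreduce.jobclient.retry.policy.enabled": "true",
    "mapred.jobtracker.restart.recover": "true",
}


def add_missing_properties(root, wanted):
    """Append a <property> for each wanted name that is not already present.

    Existing properties are left untouched. Returns the names that were added.
    """
    present = {(p.findtext("name") or "").strip() for p in root.iter("property")}
    added = []
    for name, value in wanted.items():
        if name in present:
            continue
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        added.append(name)
    return added


# Hypothetical mapred-site.xml content that already has one of the properties.
SAMPLE_MAPRED_SITE = """<configuration>
  <property>
    <name>mapred.jobtracker.restart.recover</name>
    <value>true</value>
  </property>
</configuration>"""

mapred_root = ET.fromstring(SAMPLE_MAPRED_SITE)
added_names = add_missing_properties(mapred_root, MAPRED_HA_PROPERTIES)
print(added_names)
```

Running the function a second time on the same tree adds nothing, which makes the sketch safe to repeat. It deliberately does not overwrite a property that is present with a different value; review such cases by hand.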
        
