Install HDP

Use the following instructions to install HDP on your cluster hardware. Ensure that you specify the virtual machine (configured in the previous section) as your NameNode.

Step 1: Download Hortonworks Management Console (HMC) using the instructions provided here (TODO: Add xlink to HDP1-HMC doc-set).

Do not start the HMC server until you have configured the relevant templates as outlined in the following steps.

Step 2: Add the configuration parameters for Full-Stack HA.

  1. Edit the <master-install-machine-for-HDP>/etc/puppet/master/modules/hdp-hadoop/templates/hdfs-site.xml.erb file to add the following properties:

    • Enable the HDFS client retry policy.

      <property>
       <name>dfs.client.retry.policy.enabled</name>
       <value>true</value>
       <description> Enables HDFS client retry in case of NameNode failure.</description>
      </property>
    • Configure protection for NameNode edit log.

      <property>
       <name>dfs.namenode.edits.toleration.length</name>
       <value>8192</value>
       <description> Prevents corruption of NameNode edit log.</description>
      </property>
    • Configure safe mode extension time.

      <property>
       <name>dfs.safemode.extension</name>
       <value>10</value>
       <description> The default value (30 seconds) is applicable for very large clusters. For small to large clusters (up to 200 nodes), the recommended value is 10 seconds.</description>
      </property>
    • Ensure that the allocated DFS blocks persist across multiple failovers.

      <property>
       <name>dfs.persist.blocks</name>
       <value>true</value>
       <description> Ensure that the allocated DFS blocks persist across multiple failovers.</description>
      </property>
    • Configure delay for first block report.

      <property>
       <name>dfs.blockreport.initialDelay</name>
       <value>10</value>
       <description> Delay (in seconds) for first block report. </description>
      </property>
  2. Edit the <master-install-machine-for-HDP>/etc/puppet/master/modules/hdp-hadoop/templates/mapred-site.xml.erb file to add the following properties:

    • Enable the JobTracker’s safe mode functionality.

      <property>
       <name>mapreduce.jt.hdfs.monitor.enable</name>
       <value>true</value>
       <description> Enable the JobTracker to go into safe mode when the NameNode is unresponsive.</description>
      </property>
    • Enable retry for JobTracker clients (when the JobTracker is in safe mode).

      <property>
       <name>mapreduce.jobclient.retry.policy.enabled</name>
       <value>true</value>
       <description> Enable the MapReduce job client to retry job submission when the JobTracker is in safe
          mode.</description>
      </property>
    • Enable recovery of JobTracker’s queue after it is restarted.

      <property>
       <name>mapred.jobtracker.restart.recover</name>
       <value>true</value>
       <description> Enable the JobTracker to recover its queue after the JobTracker is restarted.</description>
      </property>
  3. Edit the <master-install-machine-for-HDP>/etc/puppet/master/modules/hdp-hadoop/templates/core-site.xml.erb file to add the following properties:

    • Configure checkpoint interval so that the checkpoint is performed on an hourly basis.

      <property>
       <name>fs.checkpoint.period</name>
       <value>3600</value>
       <description> The number of seconds between two periodic checkpoints.</description>
      </property>
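
Before starting the HMC server, it can help to confirm that all three templates contain the properties listed above. The following is a minimal Python sketch of such a check, not part of HDP or HMC. The names `REQUIRED`, `parse_properties`, and `check_template` are hypothetical. It assumes each property is written as a `<property>` block with `<name>` and `<value>` children, as in the snippets above. It uses regular expressions rather than an XML parser on the assumption that ERB tags in a template may keep the file from being well-formed XML.

```python
import re

# Properties this section asks you to add, keyed by template file name.
# Names and values are copied from Step 2 above.
REQUIRED = {
    "hdfs-site.xml.erb": {
        "dfs.client.retry.policy.enabled": "true",
        "dfs.namenode.edits.toleration.length": "8192",
        "dfs.safemode.extension": "10",
        "dfs.persist.blocks": "true",
        "dfs.blockreport.initialDelay": "10",
    },
    "mapred-site.xml.erb": {
        "mapreduce.jt.hdfs.monitor.enable": "true",
        "mapreduce.jobclient.retry.policy.enabled": "true",
        "mapred.jobtracker.restart.recover": "true",
    },
    "core-site.xml.erb": {
        "fs.checkpoint.period": "3600",
    },
}

BLOCK_RE = re.compile(r"<property>(.*?)</property>", re.DOTALL)
NAME_RE = re.compile(r"<name>\s*(.*?)\s*</name>", re.DOTALL)
VALUE_RE = re.compile(r"<value>\s*(.*?)\s*</value>", re.DOTALL)


def parse_properties(text):
    """Return {name: value} for every <property> block found in the text."""
    props = {}
    for block in BLOCK_RE.findall(text):
        name = NAME_RE.search(block)
        value = VALUE_RE.search(block)
        if name and value:
            props[name.group(1)] = value.group(1)
    return props


def check_template(file_name, text):
    """Return a list of problems for one template; an empty list means OK."""
    found = parse_properties(text)
    problems = []
    for name, expected in REQUIRED[file_name].items():
        if name not in found:
            problems.append(f"{name}: missing")
        elif found[name] != expected:
            problems.append(
                f"{name}: expected {expected!r}, found {found[name]!r}"
            )
    return problems
```

To use it, read each template from the templates directory on the master install machine and pass the file name and contents to `check_template`. An empty list means every property from this section is present with the value shown above.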

 

Step 3: Complete HDP installation.

  • Continue the HMC installation process using the instructions provided here. TODO add link here

  • Ensure that you also follow the instructions listed below:

    • Use the fully qualified domain name (FQDN) of the virtual machine when configuring the host names (see: TODO add link here Prepare the Environment - Create hostdetail.txt). HMC might not identify the NameNode VM automatically, so make a note of the FQDN (IP address and DNS name) of the NameNode VM.

    • Specify shared storage for the NameNode’s directories (see: HMC - Step 6). TODO add link here

    • Do not use the NameNode VM for running any other master daemon.

  • Complete the HMC installation. Ensure that the installation was successful.

    Note

    To modify the parameters in the template files after you have installed HDP, ensure that you follow the instructions listed below:

    • Change the template files as required.

    • Stop and start the respective service through the HMC console GUI.

      For example, stop and restart the MapReduce service if the mapred-site.xml.erb file is modified. Stop and restart the HDFS service if either the core-site.xml.erb or the hdfs-site.xml.erb file is modified.
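
The restart rule in the note can be written down as a small lookup. The sketch below is illustrative only. `SERVICE_FOR_TEMPLATE` and `services_to_restart` are hypothetical names, and the mapping uses the template file names from Step 2 (including mapred-site.xml.erb). It only reports which services to stop and restart; the restart itself is still done through the HMC console.

```python
# Template file -> service to stop and restart through the HMC console,
# following the rule stated in the note above.
SERVICE_FOR_TEMPLATE = {
    "mapred-site.xml.erb": "MapReduce",
    "core-site.xml.erb": "HDFS",
    "hdfs-site.xml.erb": "HDFS",
}


def services_to_restart(modified_templates):
    """Return the sorted list of services to restart for the given template file names."""
    services = set()
    for name in modified_templates:
        if name not in SERVICE_FOR_TEMPLATE:
            raise ValueError(f"no restart rule known for {name!r}")
        services.add(SERVICE_FOR_TEMPLATE[name])
    return sorted(services)
```

For example, passing both `core-site.xml.erb` and `hdfs-site.xml.erb` yields only `HDFS`, because both files map to the same service.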