2. Upgrade Hadoop

  1. On all nodes:

    • For RHEL/CentOS:

      yum upgrade hadoop*

    • For SLES:

      zypper update hadoop*
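
      If you script the upgrade across a mixed cluster, a small helper can pick the right package command per distribution. This is a minimal sketch; the function name and distro argument are illustrative, not part of the product:

      ```shell
      # Illustrative helper: map a distribution family to its Hadoop
      # upgrade command. Only the two printed commands come from the docs.
      hadoop_upgrade_cmd() {
        case "$1" in
          rhel|centos) echo "yum upgrade hadoop*" ;;
          sles)        echo "zypper update hadoop*" ;;
          *)           echo "unsupported distribution: $1" >&2; return 1 ;;
        esac
      }

      # On each node you would then run the printed command, e.g.:
      #   eval "$(hadoop_upgrade_cmd rhel)"
      ```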

  2. Start HDFS.

    1. Start NameNode. On the NameNode host machine, execute the following command:

      sudo su -l $HDFS_USER -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start namenode -upgrade"

    2. Start Secondary NameNode. On the Secondary NameNode host machine, execute the following command:

      sudo su -l $HDFS_USER -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start secondarynamenode"

    3. Start DataNodes. On all the DataNodes, execute the following command:

      sudo su -l $HDFS_USER -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start datanode"

      where $HDFS_USER is the HDFS Service user. For example, hdfs.

    4. Execute the following command on the NameNode machine:

      hadoop dfsadmin -safemode wait

    5. Track the status of the upgrade:

      hadoop dfsadmin -upgradeProgress status

      Continue tracking until you see:

      Upgrade for version -44 has been completed.
      Upgrade is not finalized.
      Note: You finalize the upgrade later.
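
      Rather than re-running the status command by hand, you can poll it from a loop. The sketch below only inspects the text that `hadoop dfsadmin -upgradeProgress status` prints; the helper function name is an assumption, not a Hadoop command:

      ```shell
      # Illustrative helper: returns success once the status output reports
      # completion (e.g. "Upgrade for version -44 has been completed.").
      upgrade_complete() {
        case "$1" in
          *"has been completed"*) return 0 ;;
          *)                      return 1 ;;
        esac
      }

      # On the NameNode you might poll like this (not run here):
      #   while ! upgrade_complete "$(hadoop dfsadmin -upgradeProgress status)"; do
      #     sleep 10
      #   done
      ```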

  3. The upgrade is now fully functional but not yet finalized. Running the finalize command removes the previous versions of the NameNode and DataNode storage directories.

    Important: Once the upgrade is finalized, the system cannot be rolled back. Usually this step is not taken until the upgrade has been thoroughly tested.

    The upgrade must be finalized, however, before another upgrade can be performed.

    To finalize the upgrade:

    su $HDFS_USER
    hadoop dfsadmin -finalizeUpgrade

    where $HDFS_USER is the HDFS Service user. For example, hdfs.
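
    Because finalization cannot be rolled back, it can be worth wrapping the call in a guard that must be armed explicitly. This is a sketch only: the `CONFIRM` latch is a made-up convention, and only the dfsadmin call itself comes from the docs:

    ```shell
    # Illustrative guard around the irreversible finalize step.
    finalize_upgrade() {
      if [ "${CONFIRM:-no}" != "yes" ]; then
        echo "Refusing to finalize: set CONFIRM=yes once testing is done." >&2
        return 1
      fi
      # In a real cluster, run this as $HDFS_USER instead of echoing it.
      echo "hadoop dfsadmin -finalizeUpgrade"
    }
    ```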

  4. Start MapReduce.

    1. Start JobTracker. On the JobTracker host machine, execute the following command:

      sudo su -l $MAPRED_USER -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start jobtracker"

    2. Start JobHistory Server. On the JobHistory Server host machine, execute the following command:

      sudo su -l $MAPRED_USER -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start historyserver"

    3. Start all TaskTrackers. On all the TaskTrackers, execute the following command:

      sudo su -l $MAPRED_USER -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start tasktracker"

      where $MAPRED_USER is the MapReduce Service user. For example, mapred.
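
    Since the TaskTracker command must run on every TaskTracker host, it is common to fan it out over ssh. A minimal sketch, where the helper function and the slaves-file path are assumptions:

    ```shell
    # Hypothetical helper: print the TaskTracker start command for a given
    # service user, suitable for passing to ssh on each TaskTracker host.
    tasktracker_start_cmd() {
      printf 'sudo su -l %s -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start tasktracker"\n' "$1"
    }

    # Example fan-out (the slaves file location is an assumption):
    #   for host in $(cat /etc/hadoop/conf/slaves); do
    #     ssh "$host" "$(tasktracker_start_cmd mapred)"
    #   done
    ```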

