4. Upgrading the Stack

  1. Upgrade the HDP repository on all hosts and replace the old repo file with the new file:

    Important

    The file you download is named hdp.repo, but the system requires it to be named HDP.repo. After you have moved the new repo file into the repos.d folder with mv, make sure that no file named hdp.repo remains anywhere in that folder.

    • For RHEL/CentOS 5

      wget http://public-repo-1.hortonworks.com/HDP/centos5/1.x/GA/1.3.0.0/hdp.repo
      mv hdp.repo /etc/yum.repos.d/HDP.repo
    • For RHEL/CentOS 6

      wget http://public-repo-1.hortonworks.com/HDP/centos6/1.x/GA/1.3.0.0/hdp.repo 
      mv hdp.repo /etc/yum.repos.d/HDP.repo
    • For SLES 11

      wget http://public-repo-1.hortonworks.com/HDP/suse11/1.x/GA/1.3.0.0/hdp.repo
      mv hdp.repo /etc/zypp/repos.d/HDP.repo
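
The naming requirement in the note above can be checked with a small script. This is a minimal sketch and not part of the documented procedure: the function name check_repo_dir is hypothetical, and the repository directory is passed in as an argument (/etc/yum.repos.d on RHEL/CentOS, /etc/zypp/repos.d on SLES).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if DIR contains HDP.repo and no hdp.repo.
check_repo_dir() {
    dir=$1
    # Match exact names from the listing so the comparison is case-sensitive.
    names=$(ls -1 "$dir") || return 2
    if ! printf '%s\n' "$names" | grep -qx 'HDP\.repo'; then
        echo "missing: $dir/HDP.repo" >&2
        return 1
    fi
    if printf '%s\n' "$names" | grep -qx 'hdp\.repo'; then
        echo "stray file: $dir/hdp.repo" >&2
        return 1
    fi
    echo "ok: $dir"
}

# Example:
#   check_repo_dir /etc/yum.repos.d
```

The function only reports; it leaves any renaming or deletion to you.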
  2. Upgrade the stack on all Agent hosts. Skip any components your installation does not use:

    • For RHEL/CentOS

      1. Upgrade the following components:

        yum upgrade "collectd*" "epel-release*" "gccxml*" "pig*" "hadoop*" "sqoop*" \
              "zookeeper*" "hbase*" "hive*" "hcatalog*" "webhcat-tar*" oozie-client \
              hdp_mon_nagios_addons hdp_mon_ganglia_addons
      2. Upgrade Oozie:

        rpm -e --nopostun oozie-$old_version_number 
        yum install oozie            
    • For SLES

      Important

      When you remove and install packages, rename any files that have the .rpmsave extension back to their original names to retain your customized configurations. Alternatively, use the configuration files you backed up before upgrading.

      1. Upgrade the following components:

        zypper up collectd epel-release* gccxml* pig* hadoop* sqoop* hive* hcatalog* webhcat-tar* \
                  oozie-client hdp_mon_nagios_addons* hdp_mon_ganglia_addons
        yast --update hadoop hcatalog hive
      2. Upgrade ZooKeeper and HBase:

        zypper update zookeeper-3.4.5.1.3.0.0
        zypper remove zookeeper
        zypper se -s zookeeper

        You should see ZooKeeper v3.4.5.1.3.0.0 in the output.

        Install ZooKeeper v3.4.5.1.3.0.0:

        zypper install zookeeper-3.4.5.1.3.0.0

        This command also uninstalls HBase. Now use the following to install HBase:

        zypper install hbase-0.94.6.1.3.0.0
        zypper update hbase
      3. Upgrade Oozie:

        rpm -e --nopostun oozie-$old_version_number
        zypper update oozie-3.3.2.1.3.0.0
        zypper remove oozie
        zypper se -s oozie 

        You should see Oozie v3.3.2.1.3.0.0 in the output.

        Install Oozie v3.3.2.1.3.0.0:

        zypper install oozie-3.3.2.1.3.0.0
      4. Upgrade Flume:

        zypper update flume-1.3.1.1.3.0.0
        zypper remove flume
        zypper se -s flume 

        You should see Flume v1.3.1.1.3.0.0 in the output.

        Install Flume v1.3.1.1.3.0.0:

        zypper install flume-1.3.1.1.3.0.0 
      5. Upgrade Mahout:

        zypper remove mahout
        zypper se -s mahout 

        You should see Mahout v0.7.0.1.3.0.0 in the output.

        Install Mahout v0.7.0.1.3.0.0:

        zypper install mahout-0.7.0.1.3.0.0
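
Each SLES sub-step above asks you to confirm that the expected version appears in the zypper se -s output before you install it. A minimal sketch of that check follows. The helper name has_version is hypothetical, and the check is a plain fixed-string match, so it assumes nothing about how zypper lays out its columns.

```shell
#!/bin/sh
# Hypothetical helper: read search output on stdin and succeed if it
# contains VERSION anywhere (fixed-string match, not a regex).
has_version() {
    grep -qF -- "$1"
}

# Example (assumes zypper is available on the host):
#   zypper se -s zookeeper | has_version 3.4.5.1.3.0.0 \
#       && zypper install zookeeper-3.4.5.1.3.0.0
```

Because the match is on the version string alone, it is still worth reading the search output once by eye to confirm that the version belongs to the package you expect.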
  3. Restart services. On all hosts in your cluster:

    • For RHEL/CentOS

      service httpd restart
    • For SLES

      service apache2 restart
  4. Start the Ambari Server. On the Server host:

    ambari-server start
  5. Start each Ambari Agent. On all Agent hosts:

    ambari-agent start
  6. Because the file system version has now changed, you must start the NameNode manually:

    sudo su -l hdfs -c "/usr/lib/hadoop/bin/hadoop-daemon.sh start namenode -upgrade"
  7. Track the status of the upgrade:

    hadoop dfsadmin -upgradeProgress status

    Continue tracking until you see:

    Upgrade for version -44 has been completed.
    Upgrade is not finalized.

    Note

    You finalize the upgrade later.

  8. Use Services View on Ambari Web to start the HDFS service. This starts the SecondaryNameNode and the DataNodes.

  9. After the DataNodes are started, HDFS exits safemode. To monitor the status:

    hadoop dfsadmin -safemode get

    When HDFS exits safemode, the following is displayed:

    Safe mode is OFF
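
The two tracking steps above (upgrade progress and safemode) both amount to rerunning a status command until a known message appears. A minimal polling sketch follows. The helper name wait_for_output and the interval and retry values in the example are assumptions chosen for illustration; the status commands and the expected messages are the ones shown in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: run CMD every INTERVAL seconds, up to TRIES times,
# until its output contains the fixed string EXPECTED.
wait_for_output() {
    expected=$1
    interval=$2
    tries=$3
    shift 3
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Success as soon as the expected message shows up in the output.
        if "$@" 2>&1 | grep -qF -- "$expected"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "gave up waiting for: $expected" >&2
    return 1
}

# Example (30 seconds and 120 tries are arbitrary choices):
#   wait_for_output "has been completed" 30 120 hadoop dfsadmin -upgradeProgress status
#   wait_for_output "Safe mode is OFF" 30 120 hadoop dfsadmin -safemode get
```

The helper returns non-zero when it runs out of tries, so a calling script can stop instead of starting further services against a cluster that is not ready.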
  10. Make sure that the HDFS upgrade was successful. Go through steps 2 and 3 in Section 9.1 to create new versions of the logs and reports, substituting "new" for "old" in the file names as necessary.

  11. Compare the old and new versions of the following:

    • dfs-old-fsck-1.log versus dfs-new-fsck-1.log.

      The files should be identical unless the hadoop fsck reporting format has changed in the new version.

    • dfs-old-lsr-1.log versus dfs-new-lsr-1.log.

      The files should be identical unless the format of hadoop fs -lsr reporting or the data structures have changed in the new version.

    • dfs-old-report-1.log versus dfs-new-report-1.log.

      Make sure all DataNodes previously belonging to the cluster are up and running.
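
The fsck and lsr comparisons above can be scripted. This is a minimal sketch: the helper name compare_logs is hypothetical, and the example assumes it is run from the directory that holds the log files. The report pair is left out of the example because the step above asks you to read it for DataNode status, not to test it for equality.

```shell
#!/bin/sh
# Hypothetical helper: compare one old/new log pair and report the result.
# Returns 0 if identical, 1 if different, 2 if either file is missing.
compare_logs() {
    old=$1
    new=$2
    if [ ! -f "$old" ] || [ ! -f "$new" ]; then
        echo "missing: $old or $new" >&2
        return 2
    fi
    if cmp -s "$old" "$new"; then
        echo "identical: $old $new"
    else
        echo "differs: $old $new"
        return 1
    fi
}

# Example:
#   for kind in fsck lsr; do
#       compare_logs "dfs-old-$kind-1.log" "dfs-new-$kind-1.log" \
#           || diff "dfs-old-$kind-1.log" "dfs-new-$kind-1.log" | head -n 20
#   done
```

A reported difference is not necessarily a failure, since the reporting format may have changed between versions; inspect the diff before drawing a conclusion.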

  12. Use the Services View on Ambari Web to start all services in the following order:

    1. HDFS

    2. Ganglia

    3. Nagios

    4. ZooKeeper

    5. MapReduce

    6. HBase

    7. Hive

    8. Oozie

    9. WebHCat

  13. The upgrade is now fully functional but not yet finalized. Using the finalize command removes the previous version of the NameNode and DataNode storage directories.

    Important

    Once the upgrade is finalized, the system cannot be rolled back. Usually this step is not taken until the upgrade has been thoroughly tested.

    The upgrade must be finalized, however, before another upgrade can be performed.

    To finalize the upgrade:

    su $HDFS_USER
    hadoop dfsadmin -finalizeUpgrade

    where $HDFS_USER is the HDFS Service user (by default, hdfs).

