Apache Ambari Upgrade for IBM Power Systems

Prepare Hive for upgrade

Before upgrading the cluster to HDP 3.0.0, you must prepare Hive for the upgrade. The Hive pre-upgrade tool is designed to help you upgrade Hive 2 in HDP 2.6.5 and later to Hive 3 in HDP 3.0.0. Upgrading Hive in releases earlier than HDP 2.6.5 is not supported.

Some tables require major compaction before the upgrade to ensure that Hive 3 can read your tables after the upgrade. You run the pre-upgrade tool to perform the following compaction-related tasks:

  • Identify which tables or partitions need to be compacted.

  • Generate a script that you can execute in Beeline to run the compactions.

Compaction can be time-consuming and resource-intensive. You can examine the script generated by running the pre-upgrade tool to gauge how long the actual upgrade might take.

Before you begin

  • Ensure that the Hive Metastore is running. Connectivity between the tool and the Hive Metastore is mandatory.

  • Optionally, shut down HiveServer2. Shutting it down is recommended, but not required, to prevent operations on ACID tables while the tool executes.

  • The pre-upgrade tool might submit compaction jobs, so ensure that the cluster has sufficient capacity to execute those jobs. Set the hive.compactor.worker.threads property to accommodate your data.

  • In a Kerberized cluster, run kinit before executing the pre-upgrade tool command.

    When you run the pre-upgrade tool command, you might need to set -Djavax.security.auth.useSubjectCredsOnly=false in a Kerberized environment if you see the following types of errors after running kinit:

    org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
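
    For example, a minimal pre-flight sketch of the authentication step (the keytab path, principal, and realm below are assumptions; substitute the values for your cluster):

    # Authenticate as the Hive service user before running the tool.
    # Keytab path, principal, and realm are assumptions; adjust as needed.
    kinit -kt /etc/security/keytabs/hive.service.keytab hive/$(hostname -f)@EXAMPLE.COM
    # If the GSSException above still appears, add this JVM flag to the
    # pre-upgrade tool command line:
    #   -Djavax.security.auth.useSubjectCredsOnly=false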

Download the pre-upgrade tool JAR

  1. SSH into the host running the Hive Metastore. To locate this host, go to the Ambari Web UI, click Hosts, click the Filter icon, and type “Hive Metastore: All” to find each host that has a Hive Metastore instance installed on it.

  2. Change to the /tmp directory.

  3. Execute the following command to download the pre-upgrade tool JAR:

    wget http://repo.hortonworks.com/content/repositories/releases/org/apache/hive/hive-pre-upgrade/3.1.0.3.0.0.0-1634/hive-pre-upgrade-3.1.0.3.0.0.0-1634.jar
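
    Optionally, confirm that the JAR downloaded completely before proceeding. A quick sanity check (assuming the JAR landed in /tmp and the unzip utility is available):

    ls -l /tmp/hive-pre-upgrade-3.1.0.3.0.0.0-1634.jar
    # Listing the archive contents fails on a truncated download.
    unzip -l /tmp/hive-pre-upgrade-3.1.0.3.0.0.0-1634.jar > /dev/null && echo "JAR is readable"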

Procedure for preparing Hive for upgrading

  1. Export the JAVA_HOME environment variable if necessary.

    export JAVA_HOME=[ path to your installed JDK ]
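
    For example, on a cluster where the JDK is installed under /usr/jdk64 (this path is an assumption; substitute your actual JDK location):

    export JAVA_HOME=/usr/jdk64/jdk1.8.0_112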

  2. Generate scripts for compaction by running the pre-upgrade tool command:

    cd <location of downloaded pre-upgrade tool>

    $JAVA_HOME/bin/java -Djavax.security.auth.useSubjectCredsOnly=false -cp /usr/hdp/current/hive2/lib/derby-10.10.2.0.jar:/usr/hdp/current/hive2/lib/*:/usr/hdp/current/hadoop/*:/usr/hdp/current/hadoop/lib/*:/usr/hdp/current/hadoop-mapreduce/*:/usr/hdp/current/hadoop-mapreduce/lib/*:/usr/hdp/current/hadoop-hdfs/*:/usr/hdp/current/hadoop-hdfs/lib/*:/usr/hdp/current/hadoop/etc/hadoop/:/tmp/hive-pre-upgrade-3.1.0.3.0.0.0-1634.jar:/usr/hdp/current/hive/conf/conf.server org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool
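
    Because the classpath is long, you may prefer to stage the command in a small wrapper script. A minimal sketch, assuming the JAR was downloaded to /tmp and writing the generated scripts to a dedicated directory (the script name, output directory, and log path are assumptions):

    #!/usr/bin/env bash
    # Hypothetical wrapper; adjust every path to match your cluster.
    set -euo pipefail
    TOOL_JAR=/tmp/hive-pre-upgrade-3.1.0.3.0.0.0-1634.jar
    OUT_DIR=/tmp/hive_pre_upgrade_scripts
    mkdir -p "$OUT_DIR"
    # Quote the classpath so the shell does not expand the wildcards;
    # the JVM resolves classpath wildcards itself.
    CP="/usr/hdp/current/hive2/lib/derby-10.10.2.0.jar:/usr/hdp/current/hive2/lib/*:/usr/hdp/current/hadoop/*:/usr/hdp/current/hadoop/lib/*:/usr/hdp/current/hadoop-mapreduce/*:/usr/hdp/current/hadoop-mapreduce/lib/*:/usr/hdp/current/hadoop-hdfs/*:/usr/hdp/current/hadoop-hdfs/lib/*:/usr/hdp/current/hadoop/etc/hadoop/:$TOOL_JAR:/usr/hdp/current/hive/conf/conf.server"
    # -location sends the generated compaction scripts to OUT_DIR;
    # tee keeps a log of the tool's output for later review.
    "$JAVA_HOME/bin/java" -Djavax.security.auth.useSubjectCredsOnly=false \
      -cp "$CP" org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool \
      -location "$OUT_DIR" 2>&1 | tee "$OUT_DIR/pre_upgrade.log"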

  3. Optionally, examine the generated scripts to understand what running them will do.

  4. Log in to Beeline as the Hive service user, and run each generated script to prepare the cluster for upgrading.

    The Hive service user is usually the hive user; hive is the default. If you don’t know which user is the Hive service user in your cluster, go to the Ambari Web UI, click Cluster Admin > Service Accounts, and look for Hive User.
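
    For example, a minimal sketch (the HiveServer2 JDBC URL and script file name below are assumptions; use your cluster's URL and the actual file names the tool generated):

    # Run one generated script through Beeline as the Hive service user.
    beeline -u "jdbc:hive2://hs2-host.example.com:10000/default" -n hive \
      -f /tmp/hive_pre_upgrade_scripts/compacts_1803101453.sql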

  5. Prevent the execution of update, delete, or merge statements against transactional tables until the upgrade is complete.

Repeat the compaction if an update, delete, or merge occurs before you upgrade HDP.
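
You can watch the compactions requested by the scripts from any Beeline session; SHOW COMPACTIONS is standard HiveQL. For example (the JDBC URL is an assumption):

beeline -u "jdbc:hive2://hs2-host.example.com:10000/default" -n hive \
  -e "SHOW COMPACTIONS;"
# Each row shows a compaction request and its state, such as initiated,
# working, ready for cleaning, succeeded, or failed.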

Pre-upgrade tool command options

You can use the following key options with the pre-upgrade tool command:

  • -execute

    Use this option only when you want to run the pre-upgrade tool command in Ambari instead of on the Beeline command line. Using Beeline is recommended. This option automatically executes the equivalent of the generated commands.

  • -location

    Use this option to specify the location to write the scripts generated by the pre-upgrade tool.
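
    For example, to collect the generated scripts in one directory, where $CP stands for the same classpath shown in the procedure above (the directory is an assumption):

    # Write the generated scripts to a dedicated directory instead of the
    # current working directory.
    $JAVA_HOME/bin/java -Djavax.security.auth.useSubjectCredsOnly=false \
      -cp "$CP" org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool \
      -location /tmp/hive_pre_upgrade_scripts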

Running the help for the pre-upgrade command shows you all the command options:

  1. cd <location of downloaded pre-upgrade tool>

  2. Append --help to the command. For example:

    $JAVA_HOME/bin/java -Djavax.security.auth.useSubjectCredsOnly=false -cp /usr/hdp/current/hive2/lib/derby-10.10.2.0.jar:/usr/hdp/current/hive2/lib/*:/usr/hdp/current/hadoop/*:/usr/hdp/current/hadoop/lib/*:/usr/hdp/current/hadoop-mapreduce/*:/usr/hdp/current/hadoop-mapreduce/lib/*:/usr/hdp/current/hadoop-hdfs/*:/usr/hdp/current/hadoop-hdfs/lib/*:/usr/hdp/current/hadoop/etc/hadoop/:/tmp/hive-pre-upgrade-3.1.0.3.0.0.0-1634.jar:/usr/hdp/current/hive/conf/conf.server org.apache.hadoop.hive.upgrade.acid.PreUpgradeTool --help

Output is:

usage: upgrade-acid
 -execute          Executes commands equivalent to generated scripts
 -help             Generates a script to execute on 2.x cluster. This
                   requires 2.x binaries on the classpath and hive-site.xml.
 -location <arg>   Location to write scripts to. Default is CWD.

In a Kerberized environment, if you see the errors described above after running kinit, include the following option when you run the pre-upgrade tool command:

-Djavax.security.auth.useSubjectCredsOnly=false

Register and Install Target Version