Apache Ambari Major Upgrade

Upgrade Troubleshooting

Hive Post-upgrade Tasks

In the unlikely event the HDP express upgrade fails, contacting Hortonworks Support is strongly recommended. Alternatively, you can perform one or more of the following procedures to fix the upgrade.

Before you begin

To perform Hive post-upgrade tasks, you need Hive service user permissions, or all of the permissions to access Hive that Ranger provides. If you use Kerberos, you need to start Hive as the Hive service user with a valid ticket. The Hive service user is usually the default hive user. If you don’t know which user is the Hive service user in your cluster, go to the Ambari Web UI and click Cluster Admin > Service Accounts, and then look for Hive User.

To perform some steps in this procedure, you also need to log in as the HDFS superuser. If you use Kerberos, you need to become the HDFS superuser with a valid ticket.
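For example, on a cluster that uses Kerberos, you might switch to the Hive service user and obtain a ticket before starting Hive. The keytab path and principal shown here are typical HDP defaults and can differ in your cluster:

    $ sudo su - hive
    $ kinit -kt /etc/security/keytabs/hive.service.keytab hive/$(hostname -f)
    $ klist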

Restoring the Hive Metastore

  1. On the node where the database for Hive Metastore resides, create databases you want to restore. For example:

    $ mysql -u hiveuser -p -e "create database <hive_db_schema_name>;"

  2. Restore each Metastore database from the dump you created. For example:

    $ mysql -u hiveuser -p <hive_db_schema_name> < </path/to/dump_file>

  3. Re-configure Hive Metastore if necessary. Reconfiguration might be necessary if the upgrade fails. Contacting Hortonworks Support for help with reconfiguration is recommended. Alternatively, in HDP 3.x, you can set key=value properties on the command line to configure Hive Metastore.
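For example, a minimal sketch of overriding a single Metastore property on the command line when starting the Metastore service; the property and value shown are only illustrative:

    $ hive --service metastore --hiveconf hive.metastore.warehouse.dir=/warehouse/tablespace/managed/hive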

Hive 3 file locations

To locate and use your Apache Hive 3 tables after an upgrade, you need to understand the changes that occur during the upgrade process. Changes to the location of tables, permissions to HDFS directories, table types, formats, and ACID-compliance occur.

The /apps/hive directory, which was the location of the Hive 2.x warehouse, should no longer exist in HDP 3.x. After upgrading to HDP 3.x, tables should reside in the following locations:

  • /warehouse/tablespace/managed/hive

  • /warehouse/tablespace/external/hive
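You can list these directories to confirm where your tables reside after the upgrade. For example:

    $ hdfs dfs -ls /warehouse/tablespace/managed/hive
    $ hdfs dfs -ls /warehouse/tablespace/external/hive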

Correcting Hive file locations

If you see an /apps/hive directory after upgrading, the upgrade procedure did not finish moving files from the old location to the new HDP 3.x locations, and you must correct the problem. Contacting Hortonworks Support is highly recommended. Alternatively, you can perform the following procedure to correct the file locations:

  1. Log in as the HDFS superuser.

    $ sudo su - hdfs

  2. Start Hive in Ambari 2.7.3.

  3. On a node in your cluster, start Beeline in the background and a Hive shell in the foreground:

    $ hive

    Output looks something like this:

    Connected to: Apache Hive (version 3.x)

    Driver: Hive JDBC (version 3.x)

    Transaction isolation: TRANSACTION_REPEATABLE_READ

    Beeline version 3.x by Apache Hive

  4. Change the location of the database and the table from the old location to the new location. For example:

    hive> ALTER DATABASE tpcds_bin_partitioned_orc_10 SET LOCATION 'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db';

    hive> ALTER TABLE tpcds_bin_partitioned_orc_10.store_sales SET LOCATION 'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db/store_sales';

  5. On the Hive Metastore node, log in as the HDFS superuser.

  6. Set STACK_VERSION to the HDP version you are running. For example:

    $ export STACK_VERSION=`hdp-select status hive-server2 | awk '{ print $3; }'`

  7. Run the following script:

    $ /usr/$STACK_VERSION/hive/bin/hive --config /etc/hive/conf --service strictmanagedmigration --hiveconf hive.strict.managed.tables=true -m automatic --modifyManagedTables --oldWarehouseRoot /apps/hive/warehouse
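After the script completes, you can spot-check that data no longer resides under the old warehouse root and that tables point at their new locations. The database and table names below are just the ones from the earlier example:

    $ hdfs dfs -ls /apps/hive/warehouse
    hive> DESCRIBE FORMATTED tpcds_bin_partitioned_orc_10.store_sales;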

Recovering missing Hive tables

If any tables are missing after upgrading, contact Hortonworks Support for information about using the snapshots you took before upgrading to recover the missing tables. Alternatively, perform the following procedure to recover the tables:

  1. Log in as the HDFS superuser. For example:

    $ sudo su - hdfs

  2. Read the snapshots on HDFS that back up your table data. For example:

    $ hdfs dfs -cat /apps/hive/warehouse/.snapshot/s20181204-164645.898/students/000000_0

    Example output for a trivial table having two rows and three columns might look something like this:

    fred flintstone    35    1.28

    barney rubble      32    2.32

  3. In Hive, insert the data into the table if the schema exists in the Hive warehouse; otherwise, restore the Hive Metastore, which includes the schemas, from the database dump you created in the pre-upgrade process.
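A minimal sketch of reloading the snapshot data, assuming a table named students with a matching schema already exists in the warehouse; the snapshot path is the one from the example above, and the staging directory is arbitrary:

    $ hdfs dfs -mkdir -p /tmp/students_restore
    $ hdfs dfs -cp /apps/hive/warehouse/.snapshot/s20181204-164645.898/students/000000_0 /tmp/students_restore/
    $ hive
    hive> LOAD DATA INPATH '/tmp/students_restore/000000_0' INTO TABLE students;

Copying the file out of the read-only snapshot first lets LOAD DATA move it into the table directory.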

Securing Hive tables in HDP 3.x

In HDP 3.x, the major authorization model in Hive is Ranger. This model permits only Hive to access HDFS, and Hive enforces the access controls specified in Ranger. This model offers stronger security than other security schemes and more flexibility in managing policies. If you used a security system other than Ranger in HDP 2.x to protect your Hive data, contact Hortonworks Support about converting to Ranger.

If you do not enable the Ranger security service, or another security scheme, Hive defaults to storage-based authorization (SBA) based on user impersonation. Managed tables in HDP 3.x have file ownership set to hive and directory permissions set to 700. In HDP 3.x, SBA relies heavily on HDFS access control lists (ACLs), which are an extension to the permissions system in HDFS. HDP 3.x turns on ACLs in HDFS by default. For more information about HDFS ACLs, see the HDFS ACL Permissions Model.
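For example, you can inspect the ownership, permissions, and ACLs on the managed warehouse directory as the HDFS superuser:

    $ hdfs dfs -ls -d /warehouse/tablespace/managed/hive
    $ hdfs dfs -getfacl /warehouse/tablespace/managed/hive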

State of Hive tables before and after upgrading

Hive 2.x and 3.x have transactional and non-transactional tables. Transactional tables have atomicity, consistency, isolation, and durability (ACID) properties. In Hive 2.x, the initial version of ACID transaction processing is ACID v1. In Hive 3.x, the mature version is ACID v2, and ACID v2 tables are the default table type in HDP 3.0.

The following factors determine the features of a table after upgrading from Hive 2.x to 3.x:

  • ACID (transactional) or no ACID (non-transactional)

  • Security: storage-based authorization (SBA) or other, such as Ranger

  • Storage format of a table

Hive supports the following Hadoop native and non-native storage formats:

  • Native: Tables with built-in support in Hive, such as those in the following file formats:

    • Text

    • Sequence File

    • RC File

    • AVRO File

    • ORC File

    • Parquet File

  • Non-native: Tables that use a storage handler, such as the DruidStorageHandler or HBaseStorageHandler

The following table compares a number of factors before an upgrade from HDP 2.x and after an upgrade to HDP 3.x: table type, ACID operations, storage format, and security, such as whether storage-based authorization (SBA) is in effect when Ranger is not used.

Table 4.1. HDP 2.x and 3.x Table Type Comparison

HDP 2.x                                                                                            | HDP 3.x
Table Type | ACID v1 | Format                  | Security                                           | Table Type           | ACID v2 | Format
External   | No      | Native or non-native    | N/A                                                | External             | No      | Native or non-native
Managed    | Yes     | ORC                     | N/A                                                | Managed, updatable   | Yes     | ORC
Managed    | No      | ORC                     | No SBA                                             | Managed, updatable   | Yes     | ORC
Managed    | No      | Native (but non-ORC)    | No SBA                                             | Managed, insert only | Yes     | Native
Managed    | No      | Native (including ORC)  | Some type of security, such as doAs=true plus SBA  | External             | No      | Native
Managed    | No      | Non-native              | N/A                                                | External             | No      | Native
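To see how a particular table was converted, you can inspect its type and transactional properties after the upgrade; the table name here is just the one used in the earlier examples:

    hive> DESCRIBE FORMATTED tpcds_bin_partitioned_orc_10.store_sales;

In the output, Table Type shows MANAGED_TABLE or EXTERNAL_TABLE, and the transactional and transactional_properties table parameters indicate whether the table is fully ACID or insert-only.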


YARN Registry DNS instance fails to start

The YARN Registry DNS instance fails to start if another process on the host is bound to port 53. Ensure that no other service that binds to port 53 is running on the host where the YARN Registry DNS instance is deployed.
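For example, you can check whether another process is already listening on port 53 on that host:

    $ sudo netstat -tulpn | grep ':53'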

Class Loading Issue When Starting Solr

If you do not follow sequential steps during the upgrade, the Infra Solr instance may fail to start with the following exception:

null:org.apache.solr.common.SolrException: Error loading class
'org.apache.solr.security.InfraRuleBasedAuthorizationPlugin'

If you see this exception, follow the steps in this HCC article to work around the issue:

https://community.hortonworks.com/content/supportkb/210579/error-nullorgapachesolrcommonsolrexception-error-l.html

Ambari Metrics System (AMS) does not start

If Ambari Metrics System (AMS) does not start after the upgrade, the following log snippet can be observed in the HBase Master log when this problem occurs:

master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN,
ts=1543610616273, server=regionserver1.domain.com,41213,1543389145213}; ServerCrashProcedures=true.
Master startup cannot progress, in holding-pattern until region onlined

The workaround is to manually clean up the znode from ZooKeeper.

  • If AMS mode = embedded, remove the znode data from the local filesystem path. For example:

     rm -f /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/zookeeper_0/version-2/*
    
  • If AMS mode = distributed, connect to the cluster ZooKeeper instance and delete the following znode before restarting:

     /usr/hdp/current/zookeeper-client/bin/zkCli.sh -server localhost:2181
    [zk: localhost:2181(CONNECTED) 0] rmr /ams-hbase-unsecure/meta-region-server