11.5. Known Issues for Hive - Hortonworks Data Platform

BUG-19152: Upon start, HiveServer2 doesn't know about the admin users in hive.users.in.admin.role for a while

Problem: When HiveServer2 is started it takes a while for it to initialize the users set in the hive.users.in.admin.role property. This causes the first few tests in Hive SQL Standard Auth test suite to fail.

BUG-18002: NullPointerException in OrcInputFormat when Vectorization is turned on

Problem: We are unable to verify if the NPE occurs with the normal expected settings.

BUG-17850: HBCK test fails intermittently due to Empty Region Qualifier error

BUG-17846: Hive query using SUM() windowing function fails to complete and stays stuck on reduce task

Problem: The query never completes.

BUG-17247: In the Hive CLI, switching hive.execution.engine from Tez to MapReduce does not also switch the YARN framework back to MapReduce

Problem: Because the YARN framework is not switched back to MapReduce, Hive MapReduce jobs still run on Tez.

BUG-16802: Hive on Tez query passes, but the application is in the killed state.

Problem: The Hive session should shut down cleanly and not kill the app.

BUG-16667: Alter index rebuild fails with FS-based stats gathering.

Problem: We force create_index to run in MR mode during a Tez run, but it fails intermittently. (This problem is not seen on non-Tez runs.)

BUG-16476: Oozie-hive tests run as hadoopqa creates/accesses the /tmp/hive-hadoop folder

Problem: Oozie launches the Hive client as the mapreduce user (hadoop in this case), but the UGI information is that of the user calling Oozie (hadoopqa in this case). As a result, Hive always creates the /tmp/hive-hadoop scratch directory with hadoopqa as the owner. The proper fix is to create user-specific directories in the first place, which should be addressed in HIVE-6782.

Workaround: Either delete the directory or set permissions of 777 on it.

BUG-16393: Bucketized Table feature fails in some cases.

Problem: If the source and destination tables are bucketed on the same key, but the data in the source is not actually bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data is not bucketed when it is written to the destination.
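
To make the failure mode concrete, the sketch below models bucket assignment for an integer key. It assumes the commonly described rule for integer keys, (hash & Integer.MAX_VALUE) % numBuckets, with the integer used as its own hash. The class and method names are hypothetical and are not part of Hive. It checks whether every row in a file belongs to the bucket that the file is supposed to represent, which is the property a plain file load does not enforce.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Hive source code.
final class BucketSketch {

    // Assumed rule for an int bucketing key: the key is its own hash,
    // masked to be non-negative, then reduced modulo the bucket count.
    static int bucketFor(int key, int numBuckets) {
        return (key & Integer.MAX_VALUE) % numBuckets;
    }

    // True only if every key in the file hashes to the bucket the file claims to hold.
    static boolean isProperlyBucketed(List<Integer> keysInFile, int fileBucket, int numBuckets) {
        for (int key : keysInFile) {
            if (bucketFor(key, numBuckets) != fileBucket) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        // Written by a bucketing-aware insert: bucket 1 holds only keys that map to 1.
        List<Integer> inserted = Arrays.asList(1, 5, 9);
        // Copied as-is by a plain file load: keys from several buckets share one file.
        List<Integer> loaded = Arrays.asList(1, 2, 3);
        System.out.println("inserted file bucketed: " + isProperlyBucketed(inserted, 1, numBuckets));
        System.out.println("loaded file bucketed:   " + isProperlyBucketed(loaded, 1, numBuckets));
    }
}
```

A check of this kind only describes the symptom; it does not change how Hive writes the destination table.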

BUG-16257: HBase master fails to start due to BindException

Problem: On SUSE 11 64-bit, the HBase smoke test fails intermittently with: ERROR main client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. The HBase default ports fall within the range that Linux uses for short-lived ephemeral ports. As a result, the HBase master occasionally cannot start because one of its ports is already bound.
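
The overlap can be checked mechanically. The sketch below assumes the HBase default ports of this release line (master 60000, master info 60010, region server 60020, region server info 60030) and the long-standing Linux default ephemeral range of 32768 to 61000 from net.ipv4.ip_local_port_range. Both should be verified on the affected host. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the port numbers and range below are assumed defaults.
final class EphemeralPortCheck {

    // True if the port lies inside the inclusive range [low, high].
    static boolean inRange(int port, int low, int high) {
        return port >= low && port <= high;
    }

    // Returns the subset of service ports that fall inside the ephemeral range.
    static Map<String, Integer> clashing(Map<String, Integer> ports, int low, int high) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : ports.entrySet()) {
            if (inRange(e.getValue(), low, high)) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> hbasePorts = new LinkedHashMap<>();
        hbasePorts.put("hbase.master.port", 60000);
        hbasePorts.put("hbase.master.info.port", 60010);
        hbasePorts.put("hbase.regionserver.port", 60020);
        hbasePorts.put("hbase.regionserver.info.port", 60030);

        // Assumed Linux default; read /proc/sys/net/ipv4/ip_local_port_range for the real value.
        int low = 32768;
        int high = 61000;

        for (Map.Entry<String, Integer> e : clashing(hbasePorts, low, high).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue() + " is inside the ephemeral range");
        }
    }
}
```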

BUG-15733: (HIVE-7071) Schema evolution is broken on Tez.

Problem: The Hive console log reports the following error:

Vertex failed, vertexName=Map 1, vertexId=vertex_1395920136483_7733_1_00, diagnostics=[Task failed, taskId=task_1395920136483_7733_1_00_000000, diagnostics=[AttemptID:attempt_1395920136483_7733_1_00_000000_0 Info:Error: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable 
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) 
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) 
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344) 
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) 
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) 
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122) 
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:122) 
at org.apache.tez.mapreduce.input.MRInput$MRInputKVReader.next(MRInput.java:510) 
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:158) 
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160) 
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:306) 
at org.apache.hadoop.mapred.YarnTezDagChild$4.run(YarnTezDagChild.java:549) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:396) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) 
at org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:538) 
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable 
at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:44) 
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339) 
... 13 more

BUG-15003: Hive sink throws exception on shutdown

Problem: When using the Hive sink in Flume, you are likely to see the following warning in the logs, followed by a stack trace, when shutting down the Flume agent:

14/03/16 17:39:07 WARN hive.HiveSink: Exception while closing HiveEndPoint ...

There is no current evidence that this exception indicates data loss.

BUG-14986: Region assignments for a large number of regions may cause timeouts on Windows

Problem: On Windows, after creating a table with replicas and calling the Load Balancer, the Load Balancer does not run and a RegionAlreadyInTransitionException appears in the master logs.

BUG-13796: When running with correlation optimization enabled on Tez, TPCDS queries 1, 32, 94, 95 and 97 fail with ClassCastException.

BUG-13551: Oozie does not understand _HOST in the Kerberos principal name

Problem: Oozie currently expects the actual hostname in the Kerberos principal. Other services in the stack accept _HOST and replace it with the machine hostname at run time. Supporting _HOST matters in an HA setup, where the same configuration should be pushed to all Oozie servers.
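
For comparison, the substitution that other services perform can be sketched in a few lines. This is a simplified model of that behavior, not Oozie or Hadoop code: the class and method names are hypothetical, and it assumes a principal of the form service/host@REALM with the hostname lowercased before substitution.

```java
import java.util.Locale;

// Hypothetical model of _HOST substitution; not taken from Oozie or Hadoop.
final class HostPrincipalSketch {

    static final String HOST_TOKEN = "_HOST";

    // Replaces the host component with the given hostname when it equals _HOST.
    // Principals that do not match service/host@REALM are returned unchanged.
    static String resolve(String principal, String hostname) {
        int slash = principal.indexOf('/');
        int at = principal.indexOf('@');
        if (slash < 0 || at < slash) {
            return principal;
        }
        String hostPart = principal.substring(slash + 1, at);
        if (!HOST_TOKEN.equals(hostPart)) {
            return principal;
        }
        return principal.substring(0, slash + 1)
                + hostname.toLowerCase(Locale.ROOT)
                + principal.substring(at);
    }

    public static void main(String[] args) {
        // The same configured value yields a different principal on each server.
        String configured = "oozie/_HOST@EXAMPLE.COM";
        System.out.println(resolve(configured, "oozie1.example.com"));
        System.out.println(resolve(configured, "oozie2.example.com"));
    }
}
```

With substitution of this kind, one configuration value can be shared by every server in an HA pair, which is the capability the issue describes as missing.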

BUG-10512: Streaming / SELECT TRANSFORM doesn't work with Tez

Problem: SELECT TRANSFORM does not work with Tez enabled, but works in the same build with Tez disabled.

BUG-8227: (HIVE-6638) Hive needs to implement recovery or extend FileOutputCommitter.

Problem: When the RM is restarted while Hive jobs are running, the jobs start again from scratch and fail after the maximum number of retries. OutputCommitter defaults recovery to false (see below). Hive needs to implement recovery or move to extending FileOutputCommitter.

public boolean isRecoverySupported() { 
        return false;