7.2. Known Issues for Hive

  • BUG-17088: HiveServer2Concurr test fails intermittently throwing org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed

    Problem: Opening two different connections at the same time results in a replay error from the GSS-API. From the Kerberos V5 RFC (http://www.ietf.org/rfc/rfc4120.txt): "Implementation note: If a client generates multiple requests to the KDC with the same timestamp, including the microsecond field, all but the first of the requests received will be rejected as replays. This might happen, for example, if the resolution of the client's clock is too coarse. Client implementations SHOULD ensure that the timestamps are not reused, possibly by incrementing the microseconds field in the time stamp when the clock returns the same time for multiple requests."

    Workaround: Add a small delay on the client side when issuing multiple concurrent requests. One way to achieve this is to synchronize connection creation on the client side, which introduces the required delay.
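    The synchronization workaround can be sketched as follows. SerializedConnector is a hypothetical helper, not part of the Hive JDBC API; the real connection call would go inside the supplier passed to open.

```java
import java.util.function.Supplier;

// Hypothetical helper (not part of the Hive JDBC API): serializes
// connection creation so that concurrent Kerberos (GSS) handshakes do
// not reach the KDC with identical timestamps and get rejected as replays.
class SerializedConnector {
    private final Object lock = new Object();

    // connect stands in for the real connection call, e.g.
    // DriverManager.getConnection("jdbc:hive2://host:10000/default").
    <T> T open(Supplier<T> connect) {
        synchronized (lock) {
            // Holding the lock forces concurrent callers to queue,
            // which introduces the small delay between GSS requests.
            return connect.get();
        }
    }
}
```

    Because all threads funnel through the same lock, no two connection attempts carry the same Kerberos timestamp.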

  • BUG-10248: java.lang.ClassCastException while running a join query

    Problem: This error occurs when a self join uses two or more columns of different data types. For example: join tab1.a = tab1.a join tab1.b = tab1.b, where a is a double and b is a string. The string column b cannot be cast to a double; Hive should not have attempted to use the same serialization for both columns.

    Workaround: Set hive.auto.convert.join.noconditionaltask.size to a value such that the joins are split across multiple tasks.
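    For example, from the Hive CLI (the threshold shown is illustrative, not a recommendation; pick a value small enough that the joins are no longer merged into a single task):

```sql
-- Lower the combined small-table size threshold for merging
-- map-joins into one task; 10000000 bytes (10 MB) is illustrative.
SET hive.auto.convert.join.noconditionaltask.size=10000000;
```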

  • BUG-5221: Hive Windowing test Ordering_1 fails

    Problem: While executing the following query:

    select s, avg(d) over (partition by i order by f, b) from over100k;

    the following error is reported in the Hive log file:

    FAILED: SemanticException Range based Window Frame can have only 1 Sort Key

    Workaround: The workaround is to use the following query:

    select s, avg(d) over (partition by i order by f, b rows unbounded preceding) from over100k;

  • BUG-5220: Hive Windowing test OverWithExpression_3 fails

    Problem: While executing the following query:

    select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;

    the following error is reported in the Hive log file:

    NoViableAltException(15@[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN identifier ( COMMA identifier )* RPAREN ) )?])
    	at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
    	at org.antlr.runtime.DFA.predict(DFA.java:116)
    	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2298)
    	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1042)
    	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:779)
    	at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:30649)
    	at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:28851)
    	at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:28766)
    	at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28306)
    	at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28100)
    	at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1213)
    	at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:928)
    	at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
    	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
    	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
    	at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
    	at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
    	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
    	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    	at java.lang.reflect.Method.invoke(Method.java:597)
    	at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
    FAILED: ParseException line 1:53 cannot recognize input near '/' '10.0' 'from' in selection target

    Workaround: The workaround is to use the following query:

    select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;

  • BUG-5512: MapReduce task from Hive dynamic partitioning query is killed.

    Problem: When using a Hive script to create and populate a partitioned table dynamically, the following error is reported in the TaskTracker log file:

    TaskTree [pid=30275,tipID=attempt_201305041854_0350_m_000000_0] is running beyond memory-limits. Current usage : 1619562496bytes. Limit : 1610612736bytes. Killing task.
    Dump of the process-tree for attempt_201305041854_0350_m_000000_0 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 30275 20786 30275 30275 (java) 2179 476 1619562496 190241 /usr/jdk64/jdk1.6.0_31/jre/bin/java ...

    Workaround: The workaround is to disable all the memory settings by setting the value of the following properties to -1 in the mapred-site.xml file on the JobTracker and TaskTracker host machines in your cluster:

    mapred.cluster.map.memory.mb = -1
    mapred.cluster.reduce.memory.mb = -1
    mapred.job.map.memory.mb = -1
    mapred.job.reduce.memory.mb = -1
    mapred.cluster.max.map.memory.mb = -1
    mapred.cluster.max.reduce.memory.mb = -1

    To change these values through the management UI, follow its instructions for updating these properties.
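    In mapred-site.xml, each setting takes the standard Hadoop property form; a minimal sketch for the first two (repeat for each property listed above):

```xml
<!-- Disable memory-based task limits by setting each property to -1. -->
<property>
  <name>mapred.cluster.map.memory.mb</name>
  <value>-1</value>
</property>
<property>
  <name>mapred.cluster.reduce.memory.mb</name>
  <value>-1</value>
</property>
```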

  • BUG-4714: Hive Server 2 Concurrency Failure (create_index.q).

    Problem: While using indexes in Hive, the following error is reported:

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

  • BUG-2131, HIVE-5297: A Hive table partition of datatype 'int' is able to accept 'string' entries

    Problem: A partition column in a Hive table that is of datatype int is able to accept string entries. For example:

    CREATE TABLE tab1 (id1 int,id2 string) PARTITIONED BY(month string,day int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ;

    In the above example, the partition column day, of datatype int, also accepts string entries during data insertion.

    Workaround: The workaround is to avoid inserting string values into int partition columns.
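    For instance, with the table defined above, a statement like the following is accepted rather than rejected, even though day is declared int (the partition values are illustrative):

```sql
-- day is declared int, yet a non-numeric string value is accepted
-- when the partition is added, illustrating the bug.
ALTER TABLE tab1 ADD PARTITION (month='jan', day='not_a_number');
```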
