5.2. Known Issues for Hive

  • Vectorization should be disabled for tables with unsupported column types.

    Workaround: Use only supported column types (tinyint, smallint, int, bigint, float, double, boolean, and timestamp), or disable vectorization by editing the hive-site.xml file and setting the following property:

    hive.vectorized.execution.enabled=false
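
    In hive-site.xml, properties are written as XML property elements rather than key=value pairs. A minimal sketch of the entry, assuming it is placed inside the file's existing <configuration> element:

    <property>
      <name>hive.vectorized.execution.enabled</name>
      <value>false</value>
    </property>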
  • A MapReduce task from a Hive dynamic partitioning query is killed.

    Problem: When a Hive script creates and populates a partitioned table dynamically, the following error is reported in the TaskTracker log file:

    TaskTree [pid=30275,tipID=attempt_201305041854_0350_m_000000_0] is running beyond memory-limits. Current usage : 1619562496bytes. Limit : 1610612736bytes. Killing task.
    TaskTree [pid=30275,tipID=attempt_201305041854_0350_m_000000_0] is running beyond memory-limits. Current usage : 1619562496bytes. Limit : 1610612736bytes. Killing task.
    Dump of the process-tree for attempt_201305041854_0350_m_000000_0 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 30275 20786 30275 30275 (java) 2179 476 1619562496 190241 /usr/jdk64/jdk1.6.0_31/jre/bin/java ...

    Workaround: Disable the memory limits by setting the following properties to -1 in the mapred-site.xml file on the JobTracker and TaskTracker host machines in your cluster:

    mapred.cluster.map.memory.mb = -1
    mapred.cluster.reduce.memory.mb = -1
    mapred.job.map.memory.mb = -1
    mapred.job.reduce.memory.mb = -1
    mapred.cluster.max.map.memory.mb = -1
    mapred.cluster.max.reduce.memory.mb = -1

    To change these values using the UI, follow the UI instructions for updating these properties.
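
    When editing mapred-site.xml by hand, each setting becomes an XML property element. The following is a sketch of the six entries, assuming they are placed inside the file's existing <configuration> element. The JobTracker and TaskTracker daemons typically read this file at startup, so a restart is likely needed for the change to take effect:

    <property>
      <name>mapred.cluster.map.memory.mb</name>
      <value>-1</value>
    </property>
    <property>
      <name>mapred.cluster.reduce.memory.mb</name>
      <value>-1</value>
    </property>
    <property>
      <name>mapred.job.map.memory.mb</name>
      <value>-1</value>
    </property>
    <property>
      <name>mapred.job.reduce.memory.mb</name>
      <value>-1</value>
    </property>
    <property>
      <name>mapred.cluster.max.map.memory.mb</name>
      <value>-1</value>
    </property>
    <property>
      <name>mapred.cluster.max.reduce.memory.mb</name>
      <value>-1</value>
    </property>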

  • Problem: While executing the following query:

    select s, avg(d) over (partition by i order by f, b) from over100k;

    the following error is reported in the Hive log file:

    FAILED: SemanticException Range based Window Frame can have only 1 Sort Key

    Workaround: Use the following query, which adds an explicit row-based window frame:

    select s, avg(d) over (partition by i order by f, b rows unbounded preceding) from over100k;
  • Problem: While executing the following query:

    select s, i, avg(d) over (partition by s order by i) / 10.0 from over100k;

    the following error is reported in the Hive log file:

    NoViableAltException(15@[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN identifier ( COMMA identifier )* RPAREN ) )?])
    at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
    at org.antlr.runtime.DFA.predict(DFA.java:116)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectItem(HiveParser_SelectClauseParser.java:2298)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectList(HiveParser_SelectClauseParser.java:1042)
    at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:779)
    at org.apache.hadoop.hive.ql.parse.HiveParser.selectClause(HiveParser.java:30649)
    at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:28851)
    at org.apache.hadoop.hive.ql.parse.HiveParser.regular_body(HiveParser.java:28766)
    at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatement(HiveParser.java:28306)
    at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:28100)
    at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1213)
    at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:928)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:418)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
    FAILED: ParseException line 1:53 cannot recognize input near '/' '10.0' 'from' in selection target
     

    Workaround: Use the following query:

    select s, i, avg(d) / 10.0 over (partition by s order by i) from over100k;

  • Problem: While using indexes in Hive, the following error is reported:

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
  • Problem: A partition column of data type int in a Hive table accepts string values. For example:

    CREATE TABLE tab1 (id1 int, id2 string) PARTITIONED BY (month string, day int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

    In this example, the partition column day is declared as int, but it also accepts string values when data is inserted.

    Workaround: Avoid inserting string values into int partition columns.

