Release Notes

Hive

HDP 2.5.0 provides Hive 1.2.1 as part of the General Availability release. In addition, HDP 2.5.0 provides Hive 2.1.0 as a Technical Preview.

The following Apache patches are included for Hive 1.2.1:

  • HIVE-4924: JDBC Support query timeout for jdbc.

  • HIVE-6113: Upgrade Datanucleus to 4.x.

  • HIVE-6535: JDBC provide an async API to execute query and fetch results.

  • HIVE-7193: Hive should support additional LDAP authentication parameters.

  • HIVE-9365: The Metastore should take port configuration from hive-site.xml.

  • HIVE-9605: Remove parquet nested objects from wrapper writable objects.

  • HIVE-9862: Vectorized execution corrupts timestamp values.

  • HIVE-10233: Hive on tez memory manager for grace hash join.

  • HIVE-10249: ACID show locks should show who the lock is waiting for.

  • HIVE-10485: Create md5 UDF.

  • HIVE-10631: create_table_core method has invalid update for Fast Stats.

  • HIVE-10632: Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.

  • HIVE-10639: Create SHA1 UDF.

  • HIVE-10641: Create CRC32 UDF.

  • HIVE-10644: Create SHA2 UDF.

  • HIVE-10729: Query failed when select complex columns from joined table (tez map join only).

  • HIVE-10761: Create codahale-based metrics system for Hive.

  • HIVE-10815: Let HiveMetaStoreClient Choose MetaStore Randomly.

  • HIVE-10927: Add number of HMS/HS2 connection metrics.

  • HIVE-10944: Fix HS2 for Metrics.

  • HIVE-10975: Bump the parquet version up to 1.8.1.

  • HIVE-11037: HiveOnTez make explain user level = true as default.

  • HIVE-11043: ORC split strategies should adapt based on number of files.

  • HIVE-11096: Bump the parquet version to 1.7.0.

  • HIVE-11097: MR mode query fails if one table path string starts with another's.

  • HIVE-11118: Load data query should validate file formats with destination tables.

  • HIVE-11164: WebHCat should log contents of HiveConf on startup.

  • HIVE-11388: Allow ACID Compactor components to run in multiple metastores.

  • HIVE-11401: Query on Partitioned-parquet table with where clause on a partitioned column fails as Column was not found in schema.

  • HIVE-11427: Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079.

  • HIVE-11498: Hive Authorization v2 should not check permission for dummy entity.

  • HIVE-11512: Hive LDAP Authenticator should also support full DN in Authenticate().

  • HIVE-11550: ACID queries pollute HiveConf.

  • HIVE-11582: Remove conf variable hive.mapred.supports.subdirectories.

  • HIVE-11593: Add aes_encrypt and aes_decrypt UDFs.

  • HIVE-11695: If the user has no permission to create LOCAL DIRECTORY, the HQL does not throw any exception and fails silently.

  • HIVE-11716: Reading ACID table from non-acid session should raise an error.

  • HIVE-11768: java.io.DeleteOnExitHook leaks memory on long running Hive Server2 Instances.

  • HIVE-11793: SHOW LOCKS with DbTxnManager ignores filter options.

  • HIVE-11802: Floating-point numbers are displayed with different precision in Beeline/JDBC.

  • HIVE-11815: Correct the column/table names in subquery expression when creating a view.

  • HIVE-11824: Insert to local directory causes staging directory to be copied.

  • HIVE-11832: HIVE-11802 breaks compilation in JDK 8.

  • HIVE-11848: Tables in subqueries don't get locked.

  • HIVE-11866: Add framework to enable testing using LDAPServer using LDAP protocol.

  • HIVE-11891: Add basic performance logging to metastore calls.

  • HIVE-11903: Add lock metrics to HS2.

  • HIVE-11945: ORC with non-local reads may not be reusing connection to DN.

  • HIVE-11956: SHOW LOCKS should indicate what acquired the lock.

  • HIVE-11984: Add HS2 open operation metrics.

  • HIVE-12007: Hive LDAP Authenticator should allow just Domain without baseDN (for AD).

  • HIVE-12008: Hive queries failing when using count(*) on column in view.

  • HIVE-12213: Investigating the test failure TestHCatClient.testTableSchemaPropagation.

  • HIVE-12224: Remove HOLD_DDLTIME.

  • HIVE-12239: Constants in hive.common.metrics.common.MetricsConstant are not final.

  • HIVE-12271: Add metrics around HS2 query execution and job submission for Hive.

  • HIVE-12279: Testcase to verify session temporary files are removed after HIVE-11768.

  • HIVE-12353: When the Compactor fails, it calls CompactionTxnHandler.markedCleaned(); it should not.

  • HIVE-12366: Refactor Heartbeater logic for transaction.

  • HIVE-12435: SELECT COUNT(CASE WHEN ...) GROUP BY returns 1 for 'NULL' when ORC is used and vectorization is enabled.

  • HIVE-12439: CompactionTxnHandler.markCleaned() and TxnHandler.openTxns() misc improvements.

  • HIVE-12499: Add HMS metrics for number of tables and partitions.

  • HIVE-12505: Insert overwrite in same encrypted zone silently fails to remove some existing files.

  • HIVE-12525: Cleanup unused metrics in HMS.

  • HIVE-12584: Vectorized join with partition column of type char does not trim spaces.

  • HIVE-12619: Switching the field order within an array of structures causes the query to fail.

  • HIVE-12620: Misc improvement to Acid module.

  • HIVE-12628: Eliminate flakiness in TestMetrics.

  • HIVE-12634: Add command to kill an ACID transaction.

  • HIVE-12637: Make retryable SQLExceptions in TxnHandler configurable.

  • HIVE-12724: ACID: Major compaction fails to include the original bucket files into MR job.

  • HIVE-12733: UX improvements for HIVE-12499.

  • HIVE-12741: HS2 ShutdownHookManager holds extra of Driver instance.

  • HIVE-12751: Fix NVL explain syntax.

  • HIVE-12827: Vectorization VectorCopyRow/VectorAssignRow/VectorDeserializeRow assign needs explicit isNull[offset] modification.

  • HIVE-12837: Better memory estimation/allocation for hybrid grace hash join during hash table loading.

  • HIVE-12868: Fix empty operation-pool metrics.

  • HIVE-12875: Verify sem.getInputs() and sem.getOutputs().

  • HIVE-12885: LDAP Authenticator improvements.

  • HIVE-12887: Handle ORC schema on read with fewer columns than file schema (after Schema Evolution changes).

  • HIVE-12893: Hive optimize.sort.dynamic.partition not enabled in case of combination of bucketing and constant propagation if subset of partition column value set.

  • HIVE-12894: Detect whether ORC is reading from ACID table correctly for Schema Evolution.

  • HIVE-12897: Improve dynamic partition loading.

  • HIVE-12907: Improve dynamic partition loading - II.

  • HIVE-12908: Improve dynamic partition loading III.

  • HIVE-12937: DbNotificationListener unable to clean up old notification events.

  • HIVE-12965: Insert overwrite local directory should preserve the overwritten directory permission.

  • HIVE-12988: Improve dynamic partition loading IV.

  • HIVE-12992: Hive on tez Bucket map join plan is incorrect.

  • HIVE-12996: Temp tables shouldn't be locked.

  • HIVE-13008: WebHcat DDL commands in secure mode NPE when default FileSystem doesn't support delegation tokens.

  • HIVE-13013: Further Improve concurrency in TxnHandler.

  • HIVE-13017: Child process of HiveServer2 fails to get delegation token from non default FileSystem.

  • HIVE-13018: "RuntimeException: Vectorization is not supported for datatype:LIST".

  • HIVE-13033: SPDO unnecessarily duplicates columns in key & value of mapper output.

  • HIVE-13040: Handle empty bucket creations more efficiently.

  • HIVE-13043: Reload function has no impact to function registry.

  • HIVE-13051: Deadline class has numerous issues.

  • HIVE-13056: Delegation tokens do not work with HS2 when used with http transport and Kerberos.

  • HIVE-13093: Hive metastore does not exit on start failure.

  • HIVE-13095: Support view column authorization.

  • HIVE-13101: NullPointerException in HiveLexer.g.

  • HIVE-13108: Operators SORT BY randomness is not safe with network partitions.

  • HIVE-13120: Propagate doAs when generating ORC splits.

  • HIVE-13125: Support masking and filtering of rows/columns.

  • HIVE-13126: Clean up MapJoinOperator properly to avoid object cache reuse with unintentional states.

  • HIVE-13169: HiveServer2 Support delegation token based connection when using http transport.

  • HIVE-13175: Disallow making external tables transactional.

  • HIVE-13178: Enhance ORC Schema Evolution to handle more standard data type conversions.

  • HIVE-13198: Authorization issues with cascading views.

  • HIVE-13200: Aggregation functions returning empty rows on partitioned columns.

  • HIVE-13201: Compaction shouldn't be allowed on non-ACID table.

  • HIVE-13209: metastore get_delegation_token fails with null ip address.

  • HIVE-13213: Make DbLockManger work for non-acid resources.

  • HIVE-13216: ORC Reader will leave file open until GC when opening a malformed ORC file.

  • HIVE-13240: Drop the hash aggregates when closing operator.

  • HIVE-13242: DISTINCT keyword is dropped by the parser for windowing.

  • HIVE-13249: Hard upper bound on number of open transactions.

  • HIVE-13261: Can not compute column stats for partition when schema evolves.

  • HIVE-13264: JDBC driver makes 2 Open Session Calls for every open session.

  • HIVE-13267: Vectorization Add SelectLikeStringColScalar for non-filter operations.

  • HIVE-13287: Add logic to estimate stats for IN operator.

  • HIVE-13291: ORC BI Split strategy should consider block size instead of file size.

  • HIVE-13294: AvroSerde leaks the connection in a case when reading schema from a URL.

  • HIVE-13295: Improvement to LDAP search queries in HS2 LDAP Authenticator.

  • HIVE-13296: Add vectorized Q test with complex types showing count(*) etc work correctly.

  • HIVE-13299: Column Names trimmed of leading and trailing spaces.

  • HIVE-13302: Direct SQL cast to date doesn't work on Oracle.

  • HIVE-13313: TABLESAMPLE ROWS feature broken for vectorization.

  • HIVE-13318: Cache the result of getTable from metastore.

  • HIVE-13326: HiveServer2 Make ZK config publishing configurable.

  • HIVE-13330: Vectorization returns NULL for empty values for varchar/string data type.

  • HIVE-13338: Differences in vectorized_casts.q output for vectorized and non-vectorized runs.

  • HIVE-13344: Port HIVE-12902 to 1.x line.

  • HIVE-13354: Add ability to specify Compaction options per table and per request.

  • HIVE-13358: Stats state is not captured correctly: turn off stats optimizer for sampled table.

  • HIVE-13360: Refactoring Hive Authorization.

  • HIVE-13361: Orc concatenation should enforce the compression buffer size.

  • HIVE-13362: Commit binary file required for HIVE-13361.

  • HIVE-13373: Use most specific type for numerical constants.

  • HIVE-13381: Timestamp & date should have precedence in type hierarchy than string group.

  • HIVE-13390: HiveServer2 Add more test to ZK service discovery using MiniHS2.

  • HIVE-13392: Disable speculative execution for ACID Compactor.

  • HIVE-13393: Beeline Print help message for the --incremental option.

  • HIVE-13394: Analyze table call fails with ArrayIndexOutOfBounds exception.

  • HIVE-13394: Analyze table fails in tez on empty partitions.

  • HIVE-13395: Lost Update problem in ACID.

  • HIVE-13405: Fix Connection Leak in OrcRawRecordMerger.

  • HIVE-13418: HiveServer2 HTTP mode should support X-Forwarded-Host header for authorization/audits.

  • HIVE-13434: BaseSemanticAnalyzer.unescapeSQLString doesn't unescape \u0000 style character literals.

  • HIVE-13439: JDBC: provide a way to retrieve GUID to query Yarn ATS.

  • HIVE-13458: Heartbeater doesn't fail query when heartbeat fails.

  • HIVE-13462: HiveResultSetMetaData.getPrecision() fails for NULL columns.

  • HIVE-13463: Fix ImportSemanticAnalyzer to allow for different src/dst filesystems.

  • HIVE-13476: HS2 ShutdownHookManager holds extra of Driver instance in nested compile.

  • HIVE-13480: Add hadoop2 metrics reporter for Codahale metrics.

  • HIVE-13486: Cast the column type for column masking.

  • HIVE-13493: TransactionBatchImpl.getCurrentTxnId() ArrayIndexOutOfBounds.

  • HIVE-13510: Dynamic partitioning doesn’t work when remote metastore is used.

  • HIVE-13541: Pass view's ColumnAccessInfo to HiveAuthorizer.

  • HIVE-13553: CTE with upperCase alias throws exception.

  • HIVE-13561: HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used.

  • HIVE-13562: Enable vector bridge for all non-vectorized udfs.

  • HIVE-13563: Hive Streaming does not honor orc.compress.size and orc.stripe.size table properties.

  • HIVE-13568: UDFs for use in column-masking - includes updates for review comments.

  • HIVE-13570: Some queries with Union all fail when CBO is off.

  • HIVE-13572: Redundant setting full file status in Hive copyFiles.

  • HIVE-13592: Metastore calls map is not thread safe.

  • HIVE-13596: HS2 should be able to get UDFs on demand from metastore.

  • HIVE-13602: TPCH q16 return wrong result when CBO is on.

  • HIVE-13609: Fix UDTFs to allow local fetch task to fetch rows forwarded by GenericUDTF.close().

  • HIVE-13618: Trailing spaces in partition column will be treated differently.

  • HIVE-13619: Bucket map join plan is incorrect.

  • HIVE-13621: Compute stats in certain cases fails with NPE.

  • HIVE-13622: WriteSet tracking optimizations.

  • HIVE-13632: Hive failing on insert empty array into parquet table.

  • HIVE-13645: Beeline needs null-guard around hiveVars and hiveConfVars read.

  • HIVE-13646: Make hive.optimize.sort.dynamic.partition compatible with ACID tables.

  • HIVE-13660: Vectorizing IN expression with list of columns throws java.lang.ClassCastException: ExprNodeColumnDesc cannot be cast to ExprNodeConstantDesc.

  • HIVE-13670: Improve Beeline connect/reconnect semantics.

  • HIVE-13677: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException when folding CASE expression.

  • HIVE-13682: EOFException with fast hashtable.

  • HIVE-13691: No record with CQ_ID=0 found in COMPACTION_QUEUE.

  • HIVE-13693: Multi-insert query drops Filter before file output when there is a.val <> b.val.

  • HIVE-13705: Insert into table removes existing data.

  • HIVE-13716: Improve dynamic partition loading V.

  • HIVE-13725: ACID Streaming API should synchronize calls when multiple threads use the same endpoint.

  • HIVE-13726: Improve dynamic partition loading VI.

  • HIVE-13729: FileSystem leaks in FileUtils.checkFileAccessWithImpersonation.

  • HIVE-13730: Avoid double spilling the same partition when memory threshold is set very low.

  • HIVE-13743: Data move codepath is broken with hive.

  • HIVE-13750: ReduceSinkDeDuplication not working with hive.optimize.sort.dynamic.partition and ACID.

  • HIVE-13753: Make metastore client thread safe in DbTxnManager.

  • HIVE-13767: Wrong type inferred in Semijoin condition leads to AssertionError.

  • HIVE-13788: hive msck listpartitions need to make use of directSQL instead of datanucleus.

  • HIVE-13796: Fix some tests.

  • HIVE-13799: Optimize TableScanRule:checkBucketedTable.

  • HIVE-13809: Hybrid Grace Hash Join memory usage estimation didn't take into account the bloom filter size.

  • HIVE-13821: Delete statement launches 1 Mapper that takes forever to finish for a 1 TB TPC-H database (transactional).

  • HIVE-13823: Remove unnecessary log line in common join operator.

  • HIVE-13831: Error pushing predicates to HBase storage handler.

  • HIVE-13833: Add an initial delay when starting the heartbeat.

  • HIVE-13837: current_timestamp() output format is different in some cases.

  • HIVE-13840: Orc split generation is reading file footers twice.

  • HIVE-13841: Orc split generation returns different strategies with cache enabled vs disabled.

  • HIVE-13849: Wrong plan for hive.optimize.sort.dynamic.partition=true.

  • HIVE-13853: Add X-XSRF-Header filter to HS2 HTTP mode and WebHCat.

  • HIVE-13856: Fetching transaction batches during ACID streaming against Hive Metastore using Oracle DB fails.

  • HIVE-13857: Insert overwrite select from some table fails throwing org.apache.hadoop.security.AccessControlException - II.

  • HIVE-13859: mask() UDF not retaining day and month field values.

  • HIVE-13867: Restore HiveAuthorizer interface changes.

  • HIVE-13901: Hive Metastore add partitions can be slow depending on filesystems.

  • HIVE-13902: [Refactor] Minimize metastore jar dependencies on task nodes.

  • HIVE-13904: Ignore case when retrieving ColumnInfo from RowResolver.

  • HIVE-13905: Optimize ColumnStatsTask constructColumnStatsFromPackedRows to have lesser number of getTable calls.

  • HIVE-13910: Select from a table is not working if used as <dbname.tablename>.

  • HIVE-13911: Load inpath fails throwing org.apache.hadoop.security.AccessControlException.

  • HIVE-13912: DbTxnManager.commitTxn(): ORA-00918 column ambiguously defined.

  • HIVE-13922: Optimize the code path that analyzes/updates col stats.

  • HIVE-13929: org.apache.hadoop.hive.metastore.api.DataOperationType class not found error when a job is submitted by hive.

  • HIVE-13931: Add support for HikariCP and replace BoneCP usage with HikariCP.

  • HIVE-13932: Hive SMB Map Join with small set of LIMIT failed with NPE.

  • HIVE-13933: Add an option to turn off parallel file moves.

  • HIVE-13941: Return better error messages from SchemaTool.

  • HIVE-13948: Incorrect timezone handling in Writable results in wrong dates in queries.

  • HIVE-13954: Any query on Parquet table throws SLF4J:Failed to load class org.slf4j.impl.StaticLoggerBinder - resulting unwanted WARN and INFO messages on stdout.

  • HIVE-13957: Vectorized IN is inconsistent with non-vectorized (at least for decimal in (string)).

  • HIVE-13961: Major compaction fails to include the original bucket files if there's no delta directory.

  • HIVE-13972: Resolve class dependency issue introduced by HIVE-13354.

  • HIVE-13984: Use multi-threaded approach to listing files for msck.

  • HIVE-13985: ORC improvements for reducing the file system calls in task side.

  • HIVE-13997: Insert overwrite directory doesn't overwrite existing files.

  • HIVE-14006: Hive query with UNION ALL fails with ArrayIndexOutOfBoundsException.

  • HIVE-14010: parquet-logging.properties from HIVE_CONF_DIR should be used when available.

  • HIVE-14014: Zero length file is being created for empty bucket in tez mode (II).

  • HIVE-14018: Make IN clause row selectivity estimation customizable.

  • HIVE-14022: Left semi join should throw SemanticException if where clause contains columnname from right table.

  • HIVE-14038: Miscellaneous acid improvements.

  • HIVE-14045: (Vectorization) Add missing case for BINARY in VectorizationContext.getNormalizedName method.

  • HIVE-14054: TestHiveMetaStoreChecker fails on master.

  • HIVE-14070: hive.tez.exec.print.summary=true returns wrong performance numbers on HS2.

  • HIVE-14073: Update config whitelist for SQL std authorization.

  • HIVE-14080: hive.metastore.schema.verification should check for schema compatibility.

  • HIVE-14084: ORC table data load failure when metadata size is very large.

  • HIVE-14114: Ensure RecordWriter in streaming API is using the same UserGroupInformation as StreamingConnection.

  • HIVE-14126: With ranger enabled, partitioned columns is returned first when you execute select star.

  • HIVE-14132: Don't fail config validation for removed configs.

  • HIVE-14147: Hive PPD might remove predicates when they are defined as a simple expression e.g. WHERE 'a'.

  • HIVE-14192: False positive error due to thrift.
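
Several patches in the list above add hashing UDFs: md5 (HIVE-10485), SHA1 (HIVE-10639), CRC32 (HIVE-10641), and SHA2 (HIVE-10644). As a rough illustration of the kind of values such functions produce, the following Python sketch computes the corresponding digests with the standard library. It is not Hive code. The helper names are hypothetical, and the sketch assumes the digests are rendered as lowercase hexadecimal strings (an unsigned integer for CRC32); the exact return types and NULL handling of the Hive UDFs should be checked against the Hive documentation.

```python
import hashlib
import zlib


def md5_hex(data: bytes) -> str:
    """MD5 digest as a lowercase hex string (hypothetical helper)."""
    return hashlib.md5(data).hexdigest()


def sha1_hex(data: bytes) -> str:
    """SHA-1 digest as a lowercase hex string (hypothetical helper)."""
    return hashlib.sha1(data).hexdigest()


def crc32_value(data: bytes) -> int:
    """CRC-32 checksum as an unsigned 32-bit integer (hypothetical helper)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def sha2_hex(data: bytes, bits: int) -> str:
    """SHA-2 family digest for a given bit length (hypothetical helper).

    Only 224, 256, 384 and 512 are accepted here; how the Hive UDF treats
    other lengths is not covered by this sketch.
    """
    algorithms = {
        224: hashlib.sha224,
        256: hashlib.sha256,
        384: hashlib.sha384,
        512: hashlib.sha512,
    }
    if bits not in algorithms:
        raise ValueError(f"unsupported SHA-2 bit length: {bits}")
    return algorithms[bits](data).hexdigest()


if __name__ == "__main__":
    sample = b"abc"
    print(md5_hex(sample))
    print(sha1_hex(sample))
    print(crc32_value(sample))
    print(sha2_hex(sample, 256))
```

A sketch like this can be useful for spot-checking a few rows of query output after an upgrade, provided the byte encoding of the input string matches on both sides.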

The following Apache patches are included for Hive 2.1.0:

  • HIVE-10815: Let HiveMetaStoreClient Choose MetaStore Randomly.

  • HIVE-13391: add an option to LLAP to use keytab to authenticate to read data.

  • HIVE-13443: LLAP: signing for the second state of submit (the event).

  • HIVE-13617: LLAP: support non-vectorized execution in IO.

  • HIVE-13675: LLAP: add HMAC signatures to LLAPIF splits.

  • HIVE-13731: LLAP: return LLAP token with the splits.

  • HIVE-13759: LlapTaskUmbilicalExternalClient should be closed by the record reader.

  • HIVE-13771: LLAPIF: generate app ID.

  • HIVE-13827: LLAPIF: authentication on the output channel.

  • HIVE-13909: upgrade ACLs in LLAP registry when the cluster is upgraded to secure.

  • HIVE-13930: upgrade Hive to latest Hadoop version.

  • HIVE-13931: Update connection pool usage from bonecp to hikaricp.

  • HIVE-13956: LLAP: external client output is writing to channel before it is writable again.

  • HIVE-13986: LLAP: kill Tez AM on token errors from plugin.

  • HIVE-14023: LLAP: Make the Hive query id available in ContainerRunner.

  • HIVE-14045: (Vectorization) Add missing case for BINARY in VectorizationContext.getNormalizedName method.

  • HIVE-14072: QueryIds reused across different queries.

  • HIVE-14078: LLAP input split should get task attempt number from conf if available.

  • HIVE-14080: hive.metastore.schema.verification should check for schema compatibility.

  • HIVE-14091: some errors are not propagated to LLAP external clients.

  • HIVE-14093: LLAP output format connection should wait for all writes to finish before closing channel.

  • HIVE-14119: LLAP external recordreader not returning non-ascii string properly.

  • HIVE-14136: LLAP ZK SecretManager should resolve _HOST in principal.

  • HIVE-14180: Disable LlapZooKeeperRegistry ZK auth setup for external clients.

  • HIVE-14182: Revert HIVE-13084/HIVE-13924/HIVE-14034.

  • HIVE-14219: LLAP external client on secure cluster: Protocol interface org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol is not known.

  • HIVE-14224: LLAP: Rename query specific log files when a query completes execution.

  • HIVE-14230: Hadoop23Shims.cloneUgi() doesn't add credentials from original UGI.

  • HIVE-14245: NoClassDefFoundError when starting LLAP daemon.

  • HIVE-14349: Vectorization: LIKE should anchor the regexes.

  • HIVE-14364: Update timeouts for llap comparator tests.

  • HIVE-14392: llap daemons should try using YARN local dirs, if available.

  • HIVE-14403: LLAP node specific preemption will only preempt once on a node per AM.

  • HIVE-14421: FS.deleteOnExit holds references to _tmp_space.db files.

  • HIVE-14436: Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException Error: , expected at the end of 'decimal(9'" after enabling hive.optimize.skewjoin and with MR engine.

  • HIVE-14439: LlapTaskScheduler should try scheduling tasks when a node is disabled.
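
The title of HIVE-14349 ("LIKE should anchor the regexes") points at a general pitfall when a SQL LIKE pattern is evaluated through a regular expression: LIKE must match the whole value, so the regex has to be anchored at both ends. The Python sketch below illustrates that general idea only. It is not the Hive implementation, the function names are hypothetical, and escape characters in the LIKE pattern are deliberately ignored.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into a regex body without anchors.

    '%' matches any run of characters and '_' matches exactly one character;
    everything else is treated as a literal. LIKE escape characters are not
    handled in this sketch.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def like_unanchored(value: str, pattern: str) -> bool:
    """Incorrect evaluation: the regex may match anywhere inside the value."""
    return re.search(like_to_regex(pattern), value, re.DOTALL) is not None


def like_anchored(value: str, pattern: str) -> bool:
    """Correct evaluation: the regex must match the entire value."""
    return re.fullmatch(like_to_regex(pattern), value, re.DOTALL) is not None


if __name__ == "__main__":
    # The pattern 'b' has no wildcards, so it should only match the value 'b'.
    print(like_unanchored("abc", "b"))
    print(like_anchored("abc", "b"))
```

With the unanchored variant, a wildcard-free pattern behaves as if it were wrapped in '%' on both sides, which is the kind of wrong result that anchoring prevents.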