HDP-2.4.3 Release Notes

Hadoop

HDP 2.4.3 provides the following Apache patches:

  • HADOOP-7817: RawLocalFileSystem.append() should give FSDataOutputStream with accurate .getPos().

  • HADOOP-9477: Add posixGroups support for LDAP groups mapping service.

  • HADOOP-10965: Print fully qualified path in CommandWithDestination error messages.

  • HADOOP-11361: Fix a race condition in MetricsSourceAdapter.updateJmxCache.

  • HADOOP-11404: Clarify the "expected client Kerberos principal is null" authorization message.

  • HADOOP-11491: HarFs incorrectly declared as requiring an authority.

  • HADOOP-12001: Fixed LdapGroupsMapping to include configurable Posix UID and GID attributes during the search.

  • HADOOP-12076: Incomplete Cache Mechanism in CredentialProvider API.

  • HADOOP-12189: Improve CallQueueManager#swapQueue to make queue elements drop nearly impossible.

  • HADOOP-12291: Add support for nested groups in LdapGroupsMapping.

  • HADOOP-12296: When setnetgrent returns 0 in Linux, exception should be thrown.

  • HADOOP-12345: Pad hostname correctly in CredentialsSys.java.

  • HADOOP-12469: distcp should not ignore the ignoreFailures option.

  • HADOOP-12472: Make GenericTestUtils.assertExceptionContains robust.

  • HADOOP-12568: Update core-default.xml to describe posixGroups support.

  • HADOOP-12604: Exception may be swallowed in KMSClientProvider.

  • HADOOP-12622: Improve the logging in RetryPolicies and RetryInvocationHandler.

  • HADOOP-12636: Prevent ServiceLoader failure init for unused FileSystems.

  • HADOOP-12659: Incorrect usage of config parameters in token manager of KMS.

  • HADOOP-12672: RPC timeout should not override IPC ping interval.

  • HADOOP-12716: KerberosAuthenticator#doSpnegoSequence uses incorrect class to determine isKeyTab in JDK8.

  • HADOOP-12751: While using Kerberos, Hadoop incorrectly assumes names with '@' to be non-simple.

  • HADOOP-12772: NetworkTopologyWithNodeGroup.getNodeGroup() can loop infinitely for invalid 'loc' values.

  • HADOOP-12782: Faster LDAP group name resolution with ActiveDirectory.

  • HADOOP-12793: Write a new group mapping service guide.

  • HADOOP-12810: FileSystem#listLocatedStatus causes unnecessary RPC call.

  • HADOOP-12828: Print user when services are started.

  • HADOOP-12831: LocalFS/FSOutputSummer NPEs in constructor if bytes per checksum set to 0.

  • HADOOP-12847: Hadoop daemonlog should support https and SPNEGO for Kerberized cluster.

  • HADOOP-12893: Update LICENSE.txt and NOTICE.txt.

  • HADOOP-12895: SSLFactory#createSSLSocketFactory exception message is wrong.

  • HADOOP-12901: Add warning log when KMSClientProvider cannot create a connection to the KMS server.

  • HADOOP-12906: AuthenticatedURL should convert a 404/Not Found into a FileNotFoundException.

  • HADOOP-12916: Allow RPC scheduler/callqueue backoff using response times.

  • HADOOP-12962: KMS key names are incorrectly encoded when creating key.

  • HADOOP-12985: Support MetricsSource interface for DecayRpcScheduler Metrics.

  • HADOOP-12994: Specify PositionedReadable, add contract tests, fix problems.

  • HADOOP-13030: Handle special characters in passwords in KMS startup script.

  • HADOOP-13042: Restore lost leveldbjni LICENSE and NOTICE changes.

  • HADOOP-13052: ChecksumFileSystem mishandles crc file permissions.

  • HADOOP-13105: Support timeouts in LDAP queries in LdapGroupsMapping.

  • HADOOP-13155: Implement TokenRenewer to renew and cancel delegation tokens in KMS.

  • HADOOP-13159: Fix potential NPE in Metrics2 source for DecayRpcScheduler.

  • HADOOP-13179: GenericOptionsParser is not thread-safe because commons-cli OptionBuilder is not thread-safe.

  • HADOOP-13197: Add non-decayed call metrics for DecayRpcScheduler.

  • HADOOP-13251: Authenticate with Kerberos credentials when renewing KMS delegation token.

  • HADOOP-13255: KMSClientProvider should check and renew tgt when doing delegation token operations.

  • HADOOP-13263: Reload cached groups in background after expiry.

  • HADOOP-13270: BZip2CompressionInputStream finds the same compression marker twice in corner case, causing duplicate data blocks.

  • HADOOP-13285: DecayRpcScheduler MXBean should only report decayed CallVolumeSummary.

  • HADOOP-13350: Additional fix to LICENSE and NOTICE.

  • HADOOP-13351: TestDFSClientSocketSize buffer size tests are flaky.

  • HADOOP-13362: DefaultMetricsSystem leaks the source name when a source unregisters.

  • HADOOP-13403: AzureNativeFileSystem rename/delete performance improvements.

  • HADOOP-13459: Hadoop-Azure runs several test cases repeatedly, causing unnecessarily long running time.

  • HADOOP-13513: Java 1.7 support for org.apache.hadoop.fs.azure test cases.

  • HDFS-2043: TestHFlush failing intermittently.

  • HDFS-6407: Add sorting and pagination in the datanode tab of the NN Web UI.

  • HDFS-7166: SbNN Web UI shows #Under replicated blocks and #pending deletion blocks.

  • HDFS-7314: When the DFSClient lease cannot be renewed, abort open-for-write files rather than the entire DFSClient.

  • HDFS-7452: Skip StandbyException log for getCorruptFiles().

  • HDFS-7597: DelegationTokenIdentifier should cache the TokenIdentifier to UGI mapping.

  • HDFS-7978: Add LOG.isDebugEnabled() guard for some LOG.debug().

  • HDFS-8546: Use try with resources in DataStorage and Storage.

  • HDFS-8578: On upgrade, Datanode should process all storage/data dirs in parallel.

  • HDFS-8581: ContentSummary on / skips further counts on yielding lock.

  • HDFS-8758: Implement the continuation library in libhdfspp.

  • HDFS-8816: Improve visualization for the Datanode tab in the NN UI.

  • HDFS-8831: Trash Support for deletion in HDFS encryption zone.

  • HDFS-8844: TestHDFSCLI does not cleanup the test directory.

  • HDFS-8845: DiskChecker should not traverse the entire tree.

  • HDFS-8964: When validating the edit log, do not read at or beyond the file offset that is being written.

  • HDFS-9259: Make SO_SNDBUF size configurable at DFSClient side for hdfs write scenario.

  • HDFS-9317: Document fsck -blockId and -storagepolicy options.

  • HDFS-9365: Balancer does not work with the HDFS-6376 HA setup.

  • HDFS-9395: Make HDFS audit logging consistent.

  • HDFS-9412: getBlocks occupies FSLock and takes too long to complete.

  • HDFS-9415: Document dfs.cluster.administrators and dfs.permissions.superusergroup.

  • HDFS-9470: Encryption zone on root not loaded from fsimage after NN restart.

  • HDFS-9516: Truncate file fails with data dirs on multiple disks.

  • HDFS-9521: TransferFsImage.receiveFile should account and log separate times for image download and fsync to disk.

  • HDFS-9530: ReservedSpace is not cleared for abandoned Block.

  • HDFS-9533: seen_txid in the shared edits directory is modified during bootstrapping.

  • HDFS-9555: LazyPersistFileScrubber should still sleep if there are errors in the clear progress.

  • HDFS-9566: Remove expensive 'BlocksMap#getStorages(Block b, final DatanodeStorage.State state)' method.

  • HDFS-9569: Log the name of the fsimage being loaded for better supportability.

  • HDFS-9584: NPE in distcp when ssl configuration file does not exist in class path.

  • HDFS-9608: Disk IO imbalance in HDFS with heterogeneous storages.

  • HDFS-9624: DataNode starts slowly due to the initial DU command operations.

  • HDFS-9629: Update the footer of Web UI to show year 2016.

  • HDFS-9634: WebHDFS client side exceptions don't provide enough details.

  • HDFS-9648: TestStartup.testImageChecksum is broken by HDFS-9569's message change.

  • HDFS-9654: Code refactoring for HDFS-8578.

  • HDFS-9670: DistCp throws NPE when source is root.

  • HDFS-9690: ClientProtocol.addBlock is not idempotent after HDFS-8071.

  • HDFS-9713: DataXceiver#copyBlock should return if block is pinned.

  • HDFS-9715: Check storage ID uniqueness on datanode startup.

  • HDFS-9721: Allow Delimited PB OIV tool to run upon fsimage that contains INodeReference.

  • HDFS-9730: Storage ID update does not happen when there is a layout change.

  • HDFS-9748: Avoid duplication in pendingReplications when addExpectedReplicasToPending is called twice.

  • HDFS-9799: Reimplement getCurrentTrashDir to remove incompatibility.

  • HDFS-9812: Streamer threads leak if failure happens when closing DFSOutputStream.

  • HDFS-9844: Correct path creation in getTrashRoot to handle root dir.

  • HDFS-9871: "Bytes Being Moved" is negative (-1 B) when cluster was already balanced.

  • HDFS-9874: Long living DataXceiver threads cause volume shutdown to block.

  • HDFS-9881: DistributedFileSystem#getTrashRoot returns incorrect path for encryption zones.

  • HDFS-9882: Add heartbeatsTotal in Datanode metrics.

  • HDFS-9902: Support different values of dfs.datanode.du.reserved per storage type.

  • HDFS-9905: WebHdfsFileSystem#runWithRetry should display original stack trace on error.

  • HDFS-9941: Do not log StandbyException on NN, other minor logging fixes.

  • HDFS-10182: Hedged read might overwrite user's buf.

  • HDFS-10186: DirectoryScanner: Improve logs by adding full path of both actual and expected block directories.

  • HDFS-10189: PacketResponder#toString should include the downstreams for PacketResponderType.HAS_DOWNSTREAM_IN_PIPELINE.

  • HDFS-10217: Show 'blockScheduled' tooltip in datanodes table.

  • HDFS-10228: TestHDFSCLI fails.

  • HDFS-10235: Last contact for Live Nodes should be relative time.

  • HDFS-10253: Fix TestRefreshCallQueue failure.

  • HDFS-10264: Logging improvements in FSImageFormatProtobuf.Saver.

  • HDFS-10271: Extra bytes are getting released from reservedSpace for append.

  • HDFS-10277: PositionedReadable test testReadFullyZeroByteFile failing in HDFS.

  • HDFS-10291: TestShortCircuitLocalRead failing.

  • HDFS-10319: Balancer should not try to pair storages with different types.

  • HDFS-10324: Trash directory in an encryption zone should be pre-created with correct permissions.

  • HDFS-10329: Bad initialisation of StringBuffer in RequestHedgingProxyProvider.

  • HDFS-10344: DistributedFileSystem#getTrashRoots should skip encryption zone that does not have .Trash.

  • HDFS-10360: DataNode may format directory and lose blocks if current/VERSION is missing.

  • HDFS-10417: Improve error message from checkBlockLocalPathAccess.

  • HDFS-10440: Improve DataNode web UI.

  • HDFS-10458: getFileEncryptionInfo should return quickly for non-encrypted cluster.

  • HDFS-10468: HDFS read ends up ignoring an interrupt.

  • HDFS-10469: Add number of active xceivers to datanode metrics.

  • HDFS-10481: HTTPFS server should correctly impersonate as end user to open file.

  • HDFS-10493: Add links to datanode web UI in namenode datanodes page.

  • HDFS-10508: DFSInputStream should set thread's interrupt status after catching InterruptException from sleep.

  • HDFS-10556: DistCpOptions should be validated automatically.

  • HDFS-10617: PendingReconstructionBlocks.size() should be synchronized.

  • HDFS-10642: TestLazyPersistReplicaRecovery#testDnRestartWithSavedReplicas fails intermittently.

  • HDFS-10688: BPServiceActor may run into a tight loop for sending block report when hitting IOException.

  • MAPREDUCE-6543: Migrate MR Client test cases part 2.

  • MAPREDUCE-6637: Testcase Failure TestFileInputFormat.testSplitLocationInfo.

  • YARN-2104: Scheduler queue filter failed to work because index of queue column changed.

  • YARN-4325: Nodemanager log handlers fail to send finished/failed events in some cases.

  • YARN-4983: JVM and UGI metrics disappear after RM transitioned to standby mode.

  • YARN-5076: YARN web interfaces lack XFS (Cross-Frame Script) protection.

  • YARN-5112: Excessive log warnings for directory permission issue on NM recovery.

HDP 2.4.0 provided the following Apache patches:

  • HADOOP-10406: TestIPC.testIpcWithReaderQueuing may fail.

  • HADOOP-12551: Introduce FileNotFoundException for WASB FileSystem API.

  • HADOOP-12608: Fix exception message in WASB when connecting with anonymous credential.

  • HADOOP-12678: Handle empty rename pending metadata file during atomic rename in redo path.

  • HDFS-8729: Fix TestFileTruncate#testTruncateWithDataNodesRestartImmediately which occasionally failed.

  • HDFS-9358: TestNodeCount#testNodeCount timed out.

  • HDFS-9406: FSImage may get corrupted after deleting snapshot.

  • HDFS-9672: o.a.h.hdfs.TestLeaseRecovery2 fails intermittently.

  • MAPREDUCE-6566: Add retry support to mapreduce CLI tool.

  • MAPREDUCE-6618: YarnClientProtocolProvider leaking the YarnClient thread.

  • MAPREDUCE-6621: Memory Leak in JobClient#submitJobInternal().

  • YARN-3480: Remove attempts that are beyond max-attempt limit from state store.

  • YARN-4309: Add container launch related debug information to container logs when a container fails.

  • YARN-4497: RM might fail to restart when recovering apps whose attempts are missing.

  • YARN-4565: Sometimes when sizeBasedWeight FairOrderingPolicy is enabled, under stress it appears that the cluster is virtually in deadlock.

  • YARN-4584: RM startup failure when AM attempts greater than max-attempts.

  • YARN-4625: Make ApplicationSubmissionContext and ApplicationSubmissionContextInfo more consistent.