# Copyright 2012, Hortonworks Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

RELEASE NOTES: Hortonworks Data Platform for Windows 1.1 (Developer Preview) powered by Apache Hadoop
Product Version: HDP-Win-Alpha v1.1.0.1
============================================

This release of Hortonworks Data Platform (HDP) deploys the following Hadoop-related components:

* Apache Hadoop 1.0.3
* Apache Pig 0.9.3
* Apache HCatalog 0.4.0
* Apache Hive 0.9.0

Improvements and bug fixes
=============================================

Apache Hadoop
-------------

* HADOOP-8872: Fixed an issue with invoking the FileSystem.length() method on a Windows machine using JDK 6.x.
* MAPREDUCE-4564: Fixed the shell timeout mechanism. This fix enables successful termination of processes spawned by Winutils.
* MAPREDUCE-4561: Added support for node health scripts on Windows.
* MAPREDUCE-4597: Fixed intermittent failures for TestKillSubProcesses.
* HADOOP-8739: Fixed command line parsing on Windows.
* HADOOP-8664: Fixed the Hadoop streaming job issue that required the user to provide the full path to commands.
* HDFS-3766: Fixed TestStorageRestore on Windows.
* HADOOP-8634: Fixed the errors caused when the FileSystem deleteOnExit method is invoked.
* HDFS-3763: Fixed the TestNameNodeMXBean failures on Windows.
* MAPREDUCE-4510: Fixed redundant checks and logging of getconf on Windows.
* HADOOP-8731: Added public distributed cache support for Windows.
  This fixes the failures for TestTrackerDistributedCacheManager.
* HADOOP-8763: Fixed issues caused when setting the group owner on Windows.
* HADOOP-8694: Added symlink support on Windows.
* HADOOP-8732: Fixed test failures caused by incorrect process serialization on Windows.
* HDFS-3564: Added enhancements to the block placement policy.
  This enhancement enables a pluggable placement policy and provides a new API that enables moving blocks for balancing.
  It also enables the placement policy to decide the number of racks and provides the ability to extend the policy.
* HDFS-3566: Added AzureBlockPlacementPolicy to handle fault and upgrade domains in Azure.
  This policy spreads replicas across both the fault and the upgrade domain to ensure practically zero data loss.
* HADOOP-8457: Fixed the file ownership issue for users in the Administrators group on Windows.
* MAPREDUCE-4374: Added support for a configurable environment for child map/reduce tasks on Windows.
* HADOOP-8453: Added unit tests for the Winutils program.
  Winutils is the Windows console program that emulates the Linux command line utilities used by Hadoop.
* HDFS-385: Added a new experimental API, BlockPlacementPolicy, that allows investigating alternate rules for locating block replicas.
* HADOOP-8899: Fixed issues caused by the classpath exceeding the maximum operating system (OS) limit.
* MAPREDUCE-1806: Fixed issues with CombineFileInputFormat.
* HADOOP-8935: Improved Winutils to handle failures of the winutils ls command.
* HADOOP-6496: Fixed the HTTPServer issue that caused incorrect rendering of the web interface for HBase.
* HADOOP-7827: Fixed an issue with JSP pages for web interfaces.
* HADOOP-8903: Added support for the HADOOP_USER_CLASSPATH_FIRST environment variable in the hadoop.cmd file.
* HADOOP-8880: Fixed Hive test failures caused by missing Jersey JAR files in the POM template.
* HADOOP-8733: Fixed the failures caused by the dependencies in the test.sh script file.
* MAPREDUCE-4400: Fixed performance regression for small jobs and workflows.
* HADOOP-8734: Fixed LocalJobRunner to support private distributed cache.
* HDFS-3833: Fixed TestDFSShell failures on Windows caused by concurrent file read/write.
* MAPREDUCE-4598: Added support for node health scripts on Windows.
* HADOOP-8657: Fixed TestCLI not to hardcode the file length.
* HDFS-3163: Fixed failures for TestHDFSCLI.testAll caused when the user name is not in lowercase.
* HADOOP-8618: Fixed build failures caused by the Hadoop v1.0.3 merge.
* HADOOP-8544: Moved an assertion location in winutils chmod.
* HADOOP-7389: Fixed test failures caused when tests use TestingGroups.
* HADOOP-8414: Fixed issues caused by localhost resolving to an incorrect address on Windows.
* MAPREDUCE-4368: Fixed TaskRunner to handle the case where java.library.path contains a quoted path with embedded spaces on Windows.
* MAPREDUCE-4369: Fixed streaming job failures with WindowsResourceCalculatorPlugin.
* MAPREDUCE-4332: Fixed command length abort issues on Windows.
* HADOOP-8487: Fixed the HDFS tests to use correct test paths.
* HADOOP-8534: Fixed tests that left configuration files open, causing failures on Windows.
* HADOOP-8486: Fixed the resource leak caused by open file handles for SequenceFile.
* MAPREDUCE-4203: Added an implementation of the process tree for Windows.
* HADOOP-8454: Fixed the chmod bug in Winutils.
* HADOOP-8409: Fixed TestCommandLineJobSubmission and TestGenericOptionsParser to work on Windows.
* MAPREDUCE-4260: Added support for using JobObject to spawn tasks on Windows.
* MAPREDUCE-4321: Fixed failures for DefaultTaskController on Windows.
* HADOOP-8424: Fixed classpath issues causing failures for the web user interface on Windows.
* HDFS-3424: Fixed TestDatanodeBlockScanner and TestReplication failures on Windows.
* HADOOP-8374: Improved support for hard link manipulation on Windows.
* HADOOP-8440: Fixed failures for HarFileSystem.decodeHarURI.
* HADOOP-8411: Fixed TestStorageDirectoryFailure, TestTaskLogsTruncater, TestWebHdfsUrl, and TestSecurityUtil failures on Windows.
* HADOOP-8235: Added support for file permissions and ownership on Windows for RawLocalFileSystem.
* MAPREDUCE-4204: Improved ProcfsBasedProcessTree to enable the resource collection object to be pluggable.
* MAPREDUCE-4201: Fixed issues related to obtaining PIDs on Windows.
* HADOOP-8234: Enabled user group mappings on Windows.
* HADOOP-8223: Applied initial patch for branch-1-win.

Apache HCatalog
---------------

* HCATALOG-512: Fixed HCatalog unit tests on Windows.
* HCATALOG-514: Fixed HCatalog Python scripts in the package build for Windows.

Apache Hive
------------

* HIVE-3025: Fixed the Hive archive command for Hadoop 0.22 and 0.23.
* HIVE-3448: Fixed failures for the skewjoin.q test case on Windows.
* HIVE-3441: Fixed failures caused by partition column strings that are not accepted in Windows file names.
* HIVE-3436: Fixed the script_pipe.q failures on Windows.
* HIVE-3483: Fixed issues with joins that use a partitioned table on Windows.
* HIVE-3317: Fixed TestDosToUnix unit tests on Windows.
* HIVE-3320: Fixed test case failures caused by incorrect handling of CRLF line endings on Windows.
* HIVE-3319: Fixed path-related issues that caused unit test failures on Windows.
* HIVE-3327: Changed the absolute path to a relative path so that "/bin/cat" runs successfully on Windows.
* HIVE-3479: Fixed issues with negative unit tests.
* HIVE-3494: Fixed issues with the JDBC test case failures on Windows.
* HIVE-3480: Fixed file handle leaks in Symbolic and symlink-related input formats.

Apache Pig
-----------

* PIG-2958: Fixed false positive errors in TestPigRunner caused because the Pig tests did not have any logger associated with them.
* PIG-2957: Fixed failures for the TestScriptUDF test.
* PIG-2960: Increased the timeout for unit tests on Windows.
* PIG-2943: Improved DevTests and Windows checks to ensure that the Util.Windows method is used.
* PIG-2942: Fixed false failures for DevTests and TestLoad tests.
* PIG-2941: Made the chaining of the Ivy resolvers consistent and added a fallback mechanism.
* PIG-2953: Added support for the "which" utility on Windows.
* PIG-2956: Fixed issues with invalid cache specification for some streaming statements.
* PIG-2954: Fixed test failures caused by the dependency on bash.
* PIG-2959: Fixed pig.cmd to run on Windows.
* PIG-2801: Fixed the grunt "sh" command.
* PIG-2800: Fixed pig.additional.jars path separator issues.
* PIG-2799: Updated the Pig streaming interface to run correctly on Windows without Cygwin support.
* PIG-2798: Fixed issues with Pig streaming tests on Windows.
* PIG-2797: Fixed the Pig tests to use the Util.generateURI method.
* PIG-2796: Fixed issues with invalid path names on HDFS caused because the Pig tests use local temporary paths.
* PIG-2795: Added support for handling paths generated on Windows.
* PIG-2794: Added utilities to facilitate testing on the Windows platform.
* PIG-2793: Improved Pig to work on the Windows platform without Cygwin support.

Known Issues
====================

* Running MR jobs through pipes is currently not supported on Windows.
* Non-default compression codecs that are based on zlib or snappy are currently not supported on Windows.
* It is possible to encounter the following exception while starting the Hive command line interface (CLI):

  "FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: DERBY SQL error: SQLCODE: -1, SQLSTATE: XJ041, SQLERRMC: Failed to create database 'metastore_db', see the next exception for details. ::SQLSTATE: XBM0JDirectory C:\Hadoop\hive-0.9.0\bin\metastore_db already exists."

  This typically happens when the user installs and uninstalls HDP repeatedly.
  In such cases, the directories for Hive might not get deleted properly on uninstall.
  The workaround is either to manually delete the metastore_db directory (C:\Hadoop\hive-0.9.0\bin\metastore_db)
  or to uninstall HDP, delete all the files in the Hadoop directory (C:\Hadoop), and install HDP again.
* HDP creates some files in the HDFS directory (C:\hdfs) that are not deleted on uninstall.
  This issue is observed when hadoop.tmp.dir is not defined to point to the "C:\hadoop" location.
  There is no impact on the deployment of your cluster.
  However, it is recommended to delete the files in the HDFS directory (C:\hdfs) manually after you uninstall HDP.
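
The manual cleanup described in the two known issues above could be scripted roughly as follows. This is only a sketch, not part of HDP: the helper names `remove_directory` and `clear_directory` are hypothetical, and the paths are the default locations quoted in these notes, so adjust them if your installation differs. It assumes that no Hive or Hadoop process is still using the directories, and that HDP has already been uninstalled before C:\hdfs is cleared.

```python
import shutil
from pathlib import Path

# Default locations quoted in the known issues above (assumptions; adjust to your install).
METASTORE_DB = Path(r"C:\Hadoop\hive-0.9.0\bin\metastore_db")
HDFS_DIR = Path(r"C:\hdfs")


def remove_directory(path):
    """Delete the directory tree at `path`. Return True if it existed and was removed."""
    path = Path(path)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True


def clear_directory(path):
    """Delete everything inside `path` but keep the directory itself.

    Return the number of top-level entries that were removed.
    """
    path = Path(path)
    if not path.is_dir():
        return 0
    removed = 0
    for entry in path.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed


if __name__ == "__main__":
    # Stale Hive metastore left behind by repeated install/uninstall cycles.
    if remove_directory(METASTORE_DB):
        print(f"Removed {METASTORE_DB}")
    # Files left in the HDFS directory after uninstalling HDP.
    count = clear_directory(HDFS_DIR)
    print(f"Removed {count} entries from {HDFS_DIR}")
```

Both helpers do nothing when the target directory is missing, so the script can be run again safely after a later uninstall.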