System Requirements for Test and Production Clusters
To run the Hortonworks Data Platform, your system must meet minimum requirements.
• Red Hat compatible systems:
• 64-bit Red Hat Enterprise Linux (RHEL) v5.x, v6.x
• 64-bit CentOS v5.x, v6.x
• SUSE systems:
• 64-bit SUSE Linux Enterprise Server 11 (SLES 11) Service Pack 1
NOTE: Deploying secure Hadoop clusters is not supported for SUSE platforms.
• Ensure that you choose the appropriate number of host machines for your cluster:
• For evaluation purposes, you can use either a single machine or a minimal cluster of four nodes.
• All Hadoop host machines are assumed to be 64-bit hardware. However, the DataNodes and TaskTrackers are configured at the software level to run on a 32-bit JVM to conserve memory, except on SLES, which is a pure 64-bit install.
• Ensure that you have yum (RHEL) or zypper (SLES) installed on all the host machines in your cluster.
• Ensure that you install the following binaries on all the nodes in your cluster: rpm, scp, curl, wget, unzip, tar, pdsh. You can also use the auxiliary script gsPreRequisites.sh to install wget and curl on all the nodes.
• Ensure that all the nodes have JDK v 1.6 update 31 or later installed. Also ensure that, on all the nodes, the JAVA_HOME variable points to a common location (for example: /usr/java/default). To install the Java Development Kit (JDK), see the JDK deployment instructions later in this section.
• To use the Hive Metastore, you must deploy a MySQL instance on your Hive Metastore host machine. To install a MySQL instance, see the MySQL deployment instructions later in this section.
• Ensure that you use the fully qualified domain name (FQDN) for all the host machines. Only alphanumeric, hyphen (“-”), and period (“.”) characters are allowed in a valid FQDN.
• All the host machines in your cluster must be configured for DNS and Reverse DNS.
• Ensure that the Network Time Protocol (NTP) is enabled for your cluster.
• In environments with no access to the Internet, ensure that you configure one of your master nodes as an NTP server.
• If your cluster has many machines, you can minimize the load on your Internet connection and speed up the install by providing a local copy of the HDP repository. If your cluster firewall prohibits direct access to the Internet, you must provide a local copy of the HDP repository. Instructions for configuring a local mirror repository are available.
• For Red Hat compatible systems only, ensure that a copy of the EPEL (Extra Packages for Enterprise Linux) repository is available on your local mirror repository. Instructions for configuring the EPEL repository are available.
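Two of the requirements above, the required binaries and the FQDN character rule, can be checked mechanically on each node. The following is a minimal sketch, not part of the HDP tooling: the function names missing_tools and is_valid_fqdn are hypothetical, and the extra rule that a name must contain at least one period with no empty labels is an assumption about what "fully qualified" means here.

```shell
#!/bin/sh
# Binaries the requirements list names as mandatory on every node.
REQUIRED_TOOLS="rpm scp curl wget unzip tar pdsh"

# Print each tool from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Succeed only if the name uses nothing but alphanumerics, hyphens, and
# periods. Assumption: a qualified name also has at least one period and
# no empty labels (no leading, trailing, or doubled periods).
is_valid_fqdn() {
    case "$1" in
        ''|*[!A-Za-z0-9.-]*) return 1 ;;  # empty, or contains a forbidden character
        .*|*.|*..*)          return 1 ;;  # empty label
        *.*)                 return 0 ;;
        *)                   return 1 ;;  # short host name, not qualified
    esac
}
```

On a node, `missing_tools $REQUIRED_TOOLS` lists anything that still has to be installed, and `is_valid_fqdn "$(hostname -f)"` checks the name the host reports for itself.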
HDP requires the Java SE Development Kit (JDK) v 1.6 update 31 or later to be installed on all the nodes in your cluster. Follow the instructions below to deploy the JDK manually:
Step 1: Verify the existing version of Java.
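The version check can be sketched as a small script. This is an illustration rather than an official HDP command: the helper names parse_java_version and jdk_meets_minimum are hypothetical, and treating any release newer than 1.6 as acceptable is an assumption based on the "update 31 or later" wording.

```shell
#!/bin/sh
# Extract the quoted version (for example 1.6.0_31) from `java -version` output.
parse_java_version() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed if the given version string is 1.6.0 update 31 or later.
# Assumption: any release newer than 1.6 also counts as "later".
jdk_meets_minimum() {
    version="$1"
    major="$(printf '%s\n' "$version" | cut -d. -f1)"
    minor="$(printf '%s\n' "$version" | cut -d. -f2)"
    case "$version" in
        *_*) update="${version##*_}" ;;  # text after the underscore
        *)   update=0 ;;                 # no update suffix
    esac
    # Reject anything that is not purely numeric before comparing.
    for part in "$major" "$minor" "$update"; do
        case "$part" in ''|*[!0-9]*) return 1 ;; esac
    done
    [ "$major" -gt 1 ] && return 0
    [ "$minor" -gt 6 ] && return 0
    [ "$minor" -eq 6 ] && [ "$update" -ge 31 ]
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so merge it into stdout before parsing.
    installed="$(java -version 2>&1 | parse_java_version)"
    if jdk_meets_minimum "$installed"; then
        echo "JDK $installed meets the minimum"
    else
        echo "JDK $installed is older than 1.6.0_31 or could not be parsed"
    fi
else
    echo "java not found on PATH"
fi
```

A version string with a non-numeric suffix after the update number is rejected by this sketch, so such a host would need a manual check.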
Step 2 (Conditional): Uninstall the Java package if the JDK version is earlier than v 1.6 update 31.
rpm -qa | grep java
yum remove java-x.xx-gcj-compat-x.x.x.x-xxjpp_xxrh
• Verify that the default Java package is uninstalled.
Step 3: Download Oracle JDK on all the nodes.
• Browse to the Oracle JDK download location and accept the license agreement.
• For Red Hat compatible systems, download the 64-bit JDK (jdk-6u31-linux-x64.bin) and the 32-bit JDK (jdk-6u31-linux-i586.bin).
• For SUSE systems, download the 64-bit JDK (jdk-6u31-linux-x64.bin).
Step 4: Install the JDK on all the nodes.
For Red Hat compatible systems:
• Change directory to the location where you have copied the JDK installer binary (for example: /usr/jdk64 and /usr/jdk32).
• Install both the 32-bit and 64-bit JDKs.
For SUSE systems:
• Change directory to the location where you have copied the JDK installer binary (for example: /usr/jdk64).
• Install the 64-bit JDK.
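The install step for both platform families can be sketched as follows. This is a hedged illustration: the functions install_jdk and install_jdks_for and the DRY_RUN switch are hypothetical, and the sketch assumes the installers were copied to the example directories /usr/jdk64 and /usr/jdk32 and are self-extracting binaries that unpack into the directory they are run from.

```shell
#!/bin/sh
# Run one self-extracting JDK installer from the directory it was copied to.
# With DRY_RUN=1 the commands are printed instead of executed.
install_jdk() {
    dir="$1"
    installer="$2"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "cd $dir && chmod u+x $installer && ./$installer"
        return 0
    fi
    # Subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && chmod u+x "$installer" && "./$installer" )
}

# Install the JDKs the steps above call for on the given platform family.
install_jdks_for() {
    case "$1" in
        rhel) install_jdk /usr/jdk64 jdk-6u31-linux-x64.bin &&
              install_jdk /usr/jdk32 jdk-6u31-linux-i586.bin ;;
        sles) install_jdk /usr/jdk64 jdk-6u31-linux-x64.bin ;;
        *)    echo "unknown platform: $1" >&2; return 1 ;;
    esac
}
```

Setting DRY_RUN=1 and calling `install_jdks_for rhel` prints the two command lines without touching the system, which is a convenient way to review them before running the same call as root on each node.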
Step 5: If JAVA_HOME is not in the user's PATH, set JAVA_HOME to the 32-bit JDK location and add it to the PATH.
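A minimal sketch of the environment update for Step 5 follows. The directory name jdk1.6.0_31 is an assumption about where the 32-bit installer unpacks when run from /usr/jdk32; adjust it to the actual location on your nodes. On SLES, where only the 64-bit JDK is installed, the path would presumably sit under /usr/jdk64 instead.

```shell
#!/bin/sh
# Assumption: the 32-bit installer was run in /usr/jdk32 and unpacked
# into a jdk1.6.0_31 subdirectory.
JAVA_HOME=/usr/jdk32/jdk1.6.0_31
# Put the JDK's bin directory first so its java binary takes precedence.
PATH="$JAVA_HOME/bin:$PATH"
export JAVA_HOME PATH
```

To make the setting persistent, the same lines can be placed in the user's shell profile on every node.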
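If you choose to install HCatalog, you must deploy a MySQL instance on the Hive Metastore host machine. You have two options for deploying the instance: install MySQL manually (the first set of steps below), or use the auxiliary script startMySQL.sh (the second set of steps below).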
Step 1: Connect to the HCatalog host machine.
Step 2: Install MySQL server.
yum install mysql-server [for RHEL and CentOS]
zypper install mysql [for SLES]
Step 3: Start the instance.
/etc/init.d/mysqld start [for RHEL and CentOS]
/etc/init.d/mysql start [for SLES]
Step 4: Remove unnecessary information from log and STDOUT.
mysqladmin -u root 2>&1 >/dev/null
Step 5: As root, use mysql or another client tool to create the “hive” user and grant it all privileges.
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON *.* TO 'hive'@'%';
NOTE: If you are using an existing instance, the user created for the HDP Installer's use must have privileges to create the Hive database and to create tables in it.
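For scripted installs, the user-creation statements from Step 5 can be generated and piped to the client instead of typed interactively. This is a sketch under stated assumptions: hive_user_sql is a hypothetical helper, and it does no escaping, so it is only suitable for user names and passwords that contain no quote characters.

```shell
#!/bin/sh
# Print the SQL that creates a MySQL user reachable from any host and
# grants it all privileges. Values are inserted as-is, without escaping.
hive_user_sql() {
    user="$1"
    password="$2"
    cat <<EOF
CREATE USER '$user'@'%' IDENTIFIED BY '$password';
GRANT ALL PRIVILEGES ON *.* TO '$user'@'%';
EOF
}
```

Running `hive_user_sql hive hive | mysql -u root` on the MySQL host would then apply both statements in one pass.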
Step 1: Populate the following MySQL-related properties (mysqldbhost, mysqldbuser, mysqldbpasswd) in the gsInstaller.properties file, located in master-install-location/gsInstaller.
Step 2: Run the auxiliary script startMySQL.sh, located in master-install-location/gsInstaller.
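Before running startMySQL.sh, it can help to confirm that the three properties are actually populated. The check below is a hypothetical helper, not part of gsInstaller, and it assumes the properties file uses plain key=value lines.

```shell
#!/bin/sh
# Print each required MySQL property that is missing or has an empty value.
# Assumption: gsInstaller.properties uses key=value lines.
missing_mysql_props() {
    file="$1"
    for key in mysqldbhost mysqldbuser mysqldbpasswd; do
        grep -Eq "^[[:space:]]*$key[[:space:]]*=[[:space:]]*[^[:space:]]" "$file" ||
            echo "$key"
    done
}
```

Calling `missing_mysql_props gsInstaller.properties` from the gsInstaller directory prints nothing when all three values are set.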