Installing HDP Manually

Contents

1. Getting Ready to Install
Meet Minimum System Requirements
Hardware recommendations
Operating System Requirements
Software Requirements
JDK Requirements
Metastore Database Requirements
Virtualization and Cloud Platforms
Configure the Remote Repositories
Decide on Deployment Type
Collect Information
Prepare the Environment
Enable NTP on the Cluster
Check DNS
Disable SELinux
Disable IPTables
Download Companion Files
Define Environment Parameters
[Optional] Create System Users and Groups
Determine HDP Memory Configuration Settings
Running the HDP Utility Script
Manually Calculating YARN and MapReduce Memory Configuration Settings
Configuring NameNode Heap Size
Allocate Adequate Log Space for HDP
2. Installing HDFS and YARN
Set Default File and Directory Permissions
Install the Hadoop Packages
Install Compression Libraries
Install Snappy
Install LZO
Create Directories
Create the NameNode Directories
Create the SecondaryNameNode Directories
Create DataNode and YARN NodeManager Local Directories
Create the Log and PID Directories
Symlink Directories with hdp-select
3. Installing Apache ZooKeeper
Install the ZooKeeper Package
Securing ZooKeeper with Kerberos (optional)
Set Directories and Permissions
Set Up the Configuration Files
Start ZooKeeper
4. Setting Up the Hadoop Configuration
5. Validating the Core Hadoop Installation
Format and Start HDFS
Smoke Test HDFS
Configure YARN and MapReduce
Start YARN
Start MapReduce JobHistory Server
Smoke Test MapReduce
6. Installing Apache HBase
Install the HBase RPMs
Set Directories and Permissions
Set Up the Configuration Files
Validate the Installation
Starting the HBase Thrift and REST Servers
7. Installing Apache Phoenix
Configuring HBase for Phoenix
Configuring Phoenix to Run in a Secure Cluster
Smoke Testing Phoenix
Troubleshooting Phoenix
8. Installing and Configuring Apache Tez
Prerequisites
Install the Tez RPM
Configure Tez
Validate the Tez Installation
Troubleshooting
9. Installing Apache Hive and Apache HCatalog
Installing the Hive-HCatalog RPM
Setting Directories and Permissions
Setting Up the Hive/HCatalog Configuration Files
HDP-Utility Script
Configure Hive and HiveServer2 for Tez
Setting Up RDBMS for Use with the Hive Metastore
Creating Directories on HDFS
Validating the Installation
Enabling Tez for Hive Queries
Disabling Tez for Hive Queries
Configuring Tez with the Capacity Scheduler
Validating Hive-on-Tez Installation
10. Installing Apache Pig
Install the Pig RPMs
Set Up Configuration Files
Validate the Installation
11. Installing Apache WebHCat
Install the WebHCat RPMs
Upload the Pig, Hive and Sqoop tarballs to HDFS
Set Directories and Permissions
Modify WebHCat Configuration Files
Set Up HDFS User and Prepare WebHCat Directories
Validate the Installation
12. Installing Apache Oozie
Install the Oozie RPMs
Set Directories and Permissions
Set Up the Oozie Configuration Files
For Derby:
For MySQL:
For PostgreSQL:
For Oracle:
Configure Your Database for Oozie
Validate the Installation
13. Installing Apache Ranger
Installation Prerequisites
Manual Installation
Installing Policy Manager
Install the Ranger Policy Manager
Install the Ranger Policy Administration Service
Start the Ranger Policy Administration Service
Installing UserSync
Installing Ranger Plug-ins
Installing the Ranger HDFS Plug-in
Installing the Ranger HBase Plug-in
Installing the Ranger Hive Plug-in
Installing the Ranger Knox Plug-in
Installing the Ranger Storm Plug-in
Verifying the Installation
14. Installing Hue
Prerequisites
Configure HDP
Install Hue
Configure Hue
Start Hue
Configuring Hue for an External Database
Using Hue with Oracle
Using Hue with MySQL
Using Hue with PostgreSQL
15. Installing Apache Sqoop
Install the Sqoop RPMs
Set Up the Sqoop Configuration
Validate the Installation
16. Installing Apache Mahout
17. Installing and Configuring Apache Flume
Understanding Flume
Installing Flume
Configuring Flume
Starting Flume
HDP and Flume
A Simple Example
18. Installing and Configuring Apache Storm
Install the Storm RPMs
Configure Storm
Configure a Process Controller
(Optional) Configure Kerberos Authentication for Storm
(Optional) Configuring Authorization for Storm
Validate the Installation
19. Installing and Configuring Apache Spark
Spark Prerequisites
Installing Spark
Configuring Spark
Validating Spark
20. Installing and Configuring Apache Kafka
Install Kafka
Configure Kafka
Validate Kafka
21. Installing Apache Accumulo
Install the Accumulo RPM
Configure Accumulo
Validate Accumulo
22. Installing Apache Falcon
Install the Falcon RPM
Configuring Proxy Settings
Configuring Falcon Entities
Configuring Oozie for Falcon
Configuring Hive for Falcon
Configuring for Secure Clusters
Validate Falcon
23. Installing Apache Knox
Install the Knox RPMs on the Knox server
Set Up and Validate the Knox Gateway Installation
24. Installing Ganglia (Deprecated)
Install the Ganglia RPMs
Install the Configuration Files
Extract the Ganglia Configuration Files
Copy the Configuration Files
Set Up Ganglia Hosts
Set Up Configurations
Set Up Hadoop Metrics
Validate the Installation
25. Installing Nagios (Deprecated)
Install the Nagios RPMs
Install the Configuration Files
Extract the Nagios Configuration Files
Create the Nagios Directories
Copy the Configuration Files
Set the Nagios Admin Password
Set the Nagios Admin Email Contact Address
Register the Hadoop Configuration Files
Set Hosts
Set Host Groups
Set Services
Set Status
Add Templeton Status and Check TCP Wrapper Commands
Validate the Installation
26. Installing Apache Slider
27. Setting Up Security for Manual Installs
Preparing Kerberos
Kerberos Overview
Installing and Configuring the KDC
Creating the Database and Setting Up the First Administrator
Creating Service Principals and Keytab Files for HDP
Configuring HDP
Configuration Overview
Creating Mappings Between Principals and UNIX Usernames
Examples
Adding Security Information to Configuration Files
Configuring Hue
Setting up One-Way Trust with Active Directory
Configure Kerberos Hadoop Realm on the AD DC
Configure the AD Domain on the KDC and Hadoop Cluster Hosts
28. Uninstalling HDP

List of Tables

1.1. Define Directories for Core Hadoop
1.2. Define Directories for Ecosystem Components
1.3. Define Users and Groups for Systems
1.4. Typical System Users and Groups
1.5. yarn-utils.py Options
1.6. Reserved Memory Recommendations
1.7. Recommended Values
1.8. YARN and MapReduce Configuration Setting Value Calculations
1.9. Example Value Calculations
1.10. Example Value Calculations
1.11. NameNode Heap Size Settings
8.1. Tez Configuration Parameters
9.1. Hive Configuration Parameters
11.1. Hadoop core-site.xml File Properties
13.1. install.properties Entries
13.2. Properties to Update in the install.properties File
13.3. HDFS-Related Properties to Edit in the install.properties File
13.4. HBase Properties to Edit in the install.properties File
13.5. Hive-Related Properties to Edit in the install.properties File
13.6. Knox-Related Properties to Edit in the install.properties File
13.7. Storm-Related Properties to Edit in the install.properties File
14.1. Hue-Supported Browsers
14.2. Hue Dependencies on HDP Components
14.3. Variables to Configure HDFS Cluster
14.4. Variables to Configure the YARN Cluster
14.5. Beeswax Configuration Values
17.1. Flume 1.5.2 Dependencies
18.1. Required jaas.conf Sections for Cluster Nodes
18.2. Supported Authorizers
18.3. storm.yaml Configuration File Properties
18.4. worker-launcher.cfg File Configuration Properties
18.5. multitenant-scheduler.yaml Configuration File Properties
19.1. Spark Cluster Prerequisites
20.1. Kafka Configuration Properties
25.1. Host Group Parameters
25.2. Core and Monitoring Host Groups
25.3. Ecosystem Project Host Groups
27.1. Service Principals
27.2. Service Keytab File Names
27.3. General core-site.xml, Knox, and Hue
27.4. core-site.xml Master Node Settings -- Knox Gateway
27.5. core-site.xml Master Node Settings -- Hue
27.6. hdfs-site.xml File Property Settings
27.7. yarn-site.xml Property Settings
27.8. mapred-site.xml Property Settings
27.9. hbase-site.xml Property Settings -- HBase Server
27.10. hive-site.xml Property Settings
27.11. oozie-site.xml Property Settings
27.12. webhcat-site.xml Property Settings