6. Define Cluster Configuration

Use the following instructions to configure the HDP installer for your cluster:

  1. Create a clusterproperties.txt file.

  2. Add the properties to the clusterproperties.txt file as described in the table below:

    Important
    • Ensure that all the properties in the clusterproperties.txt file are separated by a newline character.

    • Ensure that the directory paths do not contain any whitespace character.

      For example, C:\Program Files\Hadoop is an invalid directory path for HDP.

    • Use the Fully Qualified Domain Name (FQDN) to specify the network host name of each cluster host. The FQDN is a DNS name that uniquely identifies the computer on the network. By default, it is a concatenation of the host name, the primary DNS suffix, and a period.

    • When specifying the host lists in the clusterproperties.txt file, if the hosts are multi-homed or have multiple NICs, make sure that each name or IP address by which you specify a host is the preferred name or IP address by which the hosts communicate among themselves. In other words, use the addresses internal to the cluster, not those used to reach cluster nodes from outside the cluster.
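The first three checks in the note above can be sketched in a few lines of Python. This is a minimal sketch, not part of the HDP installer: the function names `has_whitespace` and `looks_like_fqdn` are hypothetical, and the FQDN check is only a rough syntactic test. It accepts underscores because the sample host names in this guide use them, which is an assumption about what the installer tolerates.

```python
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with
# a hyphen. Underscores are accepted here only because the sample host names in
# this guide (for example NAMENODE_MASTER.acme.com) use them; that is an
# assumption, not a documented installer rule.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9_-]{1,63}(?<!-)$")


def has_whitespace(path: str) -> bool:
    """Return True if the directory path contains any whitespace character."""
    return any(ch.isspace() for ch in path)


def looks_like_fqdn(name: str) -> bool:
    """Rough syntactic check: at least two dot-separated labels, length <= 253."""
    name = name.rstrip(".")  # a trailing period is allowed in an FQDN
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)
```

For instance, `has_whitespace(r"C:\Program Files\Hadoop")` is true, so that path would be rejected, while a bare host name such as `slave1` fails `looks_like_fqdn` because it has no DNS suffix. On a cluster host, Python's standard `socket.getfqdn()` can be used to see which name the resolver reports for that machine.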

    Table 1.5. Configuration values for MSI installer

    | Configuration Property Name | Description | Example value | Mandatory/Optional/Conditional |
    |---|---|---|---|
    | HDP_LOG_DIR | HDP's operational logs will be written to this directory on each cluster host. Ensure that you have sufficient disk space for storing these log files. | d:\hadoop\logs | Mandatory |
    | HDP_DATA_DIR | HDP data will be stored in this directory on each cluster node. You can add multiple comma-separated data locations for multiple data directories. | d:\hdp\data | Mandatory |
    | NAMENODE_HOST | The FQDN for the cluster node that will run the NameNode master service. | NAMENODE_MASTER.acme.com | Mandatory |
    | SECONDARY_NAMENODE_HOST | The FQDN for the cluster node that will run the Secondary NameNode master service. | SECONDARY_NN_MASTER.acme.com | Mandatory |
    | JOBTRACKER_HOST | The FQDN for the cluster node that will run the JobTracker master service. | JOBTRACKER_MASTER.acme.com | Mandatory |
    | HIVE_SERVER_HOST | The FQDN for the cluster node that will run the Hive Server master service. | HIVE_SERVER_MASTER.acme.com | Mandatory |
    | OOZIE_SERVER_HOST | The FQDN for the cluster node that will run the Oozie Server master service. | OOZIE_SERVER_MASTER.acme.com | Mandatory |
    | TEMPLETON_HOST | The FQDN for the cluster node that will run the Templeton master service. | TEMPLETON_MASTER.acme.com | Mandatory |
    | SLAVE_HOSTS | A comma-separated list of FQDNs for the cluster nodes that will run the DataNode and TaskTracker services. | slave1.acme.com, slave2.acme.com, slave3.acme.com | Mandatory |
    | DB_FLAVOR | Database type for the Hive and Oozie metastores (allowed databases are SQL Server and Derby). To use the default embedded Derby instance, set the value of this property to derby. To use an existing SQL Server instance as the metastore DB, set the value to mssql. | mssql or derby | Mandatory |
    | DB_HOSTNAME | FQDN for the node where the metastore database service is installed. If using SQL Server, set the value to your SQL Server hostname. If using Derby for the Hive metastore, set the value to HIVE_SERVER_HOST. | sqlserver1.acme.com | Mandatory |
    | DB_PORT | This is an optional property required only if you are using SQL Server for the Hive and Oozie metastores. By default, the database port is set to 1433. | 1433 | Optional |
    | HIVE_DB_NAME | Database for the Hive metastore. If using SQL Server, ensure that you create the database on the SQL Server instance. | hivedb | Mandatory |
    | HIVE_DB_USERNAME | User account credentials for the Hive metastore database instance. Ensure that this user account has appropriate permissions. | hive_user | Mandatory |
    | HIVE_DB_PASSWORD | Password for the HIVE_DB_USERNAME account. | hive_pass | Mandatory |
    | OOZIE_DB_NAME | Database for the Oozie metastore. If using SQL Server, ensure that you create the database on the SQL Server instance. | ooziedb | Mandatory |
    | OOZIE_DB_USERNAME | User account credentials for the Oozie metastore database instance. Ensure that this user account has appropriate permissions. | oozie_user | Mandatory |
    | OOZIE_DB_PASSWORD | Password for the OOZIE_DB_USERNAME account. | oozie_pass | Mandatory |
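The rules in Table 1.5 can be checked mechanically before running the installer. The following Python sketch is illustrative only and is not part of HDP: `check_properties` and `effective_db_port` are hypothetical helper names, the property dictionary is assumed to have been loaded already, and treating `DB_FLAVOR` as case-sensitive is an assumption.

```python
# Property names marked Mandatory in Table 1.5.
MANDATORY = (
    "HDP_LOG_DIR", "HDP_DATA_DIR",
    "NAMENODE_HOST", "SECONDARY_NAMENODE_HOST", "JOBTRACKER_HOST",
    "HIVE_SERVER_HOST", "OOZIE_SERVER_HOST", "TEMPLETON_HOST", "SLAVE_HOSTS",
    "DB_FLAVOR", "DB_HOSTNAME",
    "HIVE_DB_NAME", "HIVE_DB_USERNAME", "HIVE_DB_PASSWORD",
    "OOZIE_DB_NAME", "OOZIE_DB_USERNAME", "OOZIE_DB_PASSWORD",
)
ALLOWED_DB_FLAVORS = ("mssql", "derby")  # compared case-sensitively (assumption)
DEFAULT_DB_PORT = "1433"


def check_properties(props: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing mandatory property: {name}"
                for name in MANDATORY if not props.get(name, "").strip()]
    flavor = props.get("DB_FLAVOR", "").strip()
    if flavor and flavor not in ALLOWED_DB_FLAVORS:
        problems.append(
            f"DB_FLAVOR must be one of {ALLOWED_DB_FLAVORS}, got {flavor!r}")
    return problems


def effective_db_port(props: dict):
    """DB_PORT applies only to SQL Server; fall back to 1433 when it is omitted."""
    if props.get("DB_FLAVOR", "").strip() != "mssql":
        return None
    return props.get("DB_PORT", "").strip() or DEFAULT_DB_PORT
```

Returning a list of problems rather than raising on the first one lets all missing properties be reported in a single pass, which is convenient when the same file is copied to many hosts.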

    The following example shows a sample clusterproperties.txt file:

    #Log directory
    HDP_LOG_DIR=d:\hadoop\logs
    
    #Data directory
    HDP_DATA_DIR=d:\hdp\data
    
    #Hosts
    NAMENODE_HOST=NAMENODE_MASTER.acme.com
    SECONDARY_NAMENODE_HOST=SECONDARY_NAMENODE_MASTER.acme.com
    JOBTRACKER_HOST=JOBTRACKER_MASTER.acme.com
    HIVE_SERVER_HOST=HIVE_SERVER_MASTER.acme.com
    OOZIE_SERVER_HOST=OOZIE_SERVER_MASTER.acme.com
    TEMPLETON_HOST=TEMPLETON_MASTER.acme.com
    SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
    
    #Database host
    DB_FLAVOR=derby
    DB_HOSTNAME=HIVE_SERVER_MASTER.acme.com
    
    #Hive properties
    HIVE_DB_NAME=hive
    HIVE_DB_USERNAME=hive
    HIVE_DB_PASSWORD=hive
    
    #Oozie properties
    OOZIE_DB_NAME=oozie
    OOZIE_DB_USERNAME=oozie
    OOZIE_DB_PASSWORD=oozie
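To inspect a file like the sample above, it can be read into a dictionary with a few lines of Python. This is a minimal sketch under stated assumptions: the installer's own parsing rules are not described here, so treating lines that start with `#` as comments and trimming spaces around list items are inferred from the sample, and `parse_cluster_properties` and `split_list` are hypothetical names.

```python
def parse_cluster_properties(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and lines that start with '#'."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def split_list(value: str) -> list:
    """Split a comma-separated value (SLAVE_HOSTS, HDP_DATA_DIR) into trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


# A shortened copy of the sample file; the raw string keeps backslashes literal.
SAMPLE = r"""
#Log directory
HDP_LOG_DIR=d:\hadoop\logs

#Hosts
NAMENODE_HOST=NAMENODE_MASTER.acme.com
SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
"""

props = parse_cluster_properties(SAMPLE)
slaves = split_list(props["SLAVE_HOSTS"])
```

Here `props` maps each property name to its raw string value and `slaves` holds the three slave host names as separate items, which makes per-host checks straightforward.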