6.2. Using the CLI for Cluster Configuration

Use the following instructions to manually configure the cluster properties for the HDP installer:

  1. Create a clusterproperties.txt file.

  2. Add the properties to the clusterproperties.txt file as described in the table below:

    Important
    • All properties in the clusterproperties.txt file must be separated by a newline character.

    • Directory paths cannot contain whitespace characters.

      For example, C:\Program Files\Hadoop is an invalid directory path for HDP.

    • Use Fully Qualified Domain Names (FQDN) for specifying the network host name for each cluster host. The FQDN is a DNS name that uniquely identifies the computer on the network. By default, it is a concatenation of the host name, the primary DNS suffix, and a period.

    • When specifying the host lists in the clusterproperties.txt file, if the hosts are multi-homed or have multiple NICs, make sure that each name or IP address you specify is the preferred name or IP address by which the hosts communicate among themselves. In other words, use the addresses internal to the cluster, not those used for addressing cluster nodes from outside the cluster.

    • To enable NameNode HA, you must include the HA properties listed at the bottom of the following table.
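The path and host name rules above can be checked before running the installer. The following is a minimal sketch in Python, not part of HDP: the function names `has_whitespace` and `looks_like_fqdn` are hypothetical, and the FQDN check is only a heuristic (at least two dot-separated labels, with underscores allowed because the example host names in this section use them). It does not query DNS.

```python
import re

# Hypothetical helpers for illustration; they are not part of the HDP installer.
# A label may contain letters, digits, hyphens and underscores, and must not
# start or end with a hyphen.
_LABEL = re.compile(r"^[A-Za-z0-9_]([A-Za-z0-9_-]*[A-Za-z0-9_])?$")


def has_whitespace(path: str) -> bool:
    """Return True if the directory path contains any whitespace character."""
    return any(ch.isspace() for ch in path)


def looks_like_fqdn(host: str) -> bool:
    """Heuristic check: two or more well-formed, dot-separated labels."""
    labels = host.rstrip(".").split(".")
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)


if __name__ == "__main__":
    for path in (r"C:\Program Files\Hadoop", r"d:\hadoop\logs"):
        print(path, "-> invalid" if has_whitespace(path) else "-> ok")
    for host in ("NAMENODE_MASTER.acme.com", "slave1"):
        print(host, "-> ok" if looks_like_fqdn(host) else "-> not an FQDN")
```

A check like this only catches formatting mistakes. Whether a name is the one the hosts actually use to reach each other inside the cluster still has to be confirmed on the network itself.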

    Configuration Values for MSI Installer

    Configuration Property Name | Description | Example Value | Mandatory/Optional/Conditional
    HDP_LOG_DIR | HDP's operational logs are written to this directory on each cluster host. Ensure that you have sufficient disk space for storing these log files. | d:\hadoop\logs | Mandatory
    HDP_DATA_DIR | HDP data is stored in this directory on each cluster node. To use multiple data directories, specify a comma-separated list of locations. | d:\hdp\data | Mandatory
    NAMENODE_HOST | The FQDN for the cluster node that will run the NameNode master service. | NAMENODE_MASTER.acme.com | Mandatory
    SECONDARY_NAMENODE_HOST | The FQDN for the cluster node that will run the Secondary NameNode master service. | SECONDARY_NN_MASTER.acme.com | Mandatory
    RESOURCEMANAGER_HOST | The FQDN for the cluster node that will run the YARN Resource Manager master service. | RESOURCE_MANAGER.acme.com | Mandatory
    HIVE_SERVER_HOST | The FQDN for the cluster node that will run the Hive Server master service. | HIVE_SERVER_MASTER.acme.com | Mandatory
    OOZIE_SERVER_HOST | The FQDN for the cluster node that will run the Oozie Server master service. | OOZIE_SERVER_MASTER.acme.com | Mandatory
    WEBHCAT_HOST | The FQDN for the cluster node that will run the WebHCat master service. | WEBHCAT_MASTER.acme.com | Mandatory
    FLUME_HOSTS | A comma-separated list of FQDNs for the cluster nodes that will run the Flume service. | FLUME_SERVICE1.acme.com, FLUME_SERVICE2.acme.com, FLUME_SERVICE3.acme.com | Mandatory
    HBASE_MASTER | The FQDN for the cluster node that will run the HBase master. | HBASE_MASTER.acme.com | Mandatory
    HBASE_REGIONSERVERS | A comma-separated list of FQDNs for the cluster nodes that will run the HBase Region Server services. | slave1.acme.com, slave2.acme.com, slave3.acme.com | Mandatory
    SLAVE_HOSTS | A comma-separated list of FQDNs for the cluster nodes that will run the DataNode and TaskTracker services. | slave1.acme.com, slave2.acme.com, slave3.acme.com | Mandatory
    ZOOKEEPER_HOSTS | A comma-separated list of FQDNs for the cluster nodes that will run the ZooKeeper service. | ZOOKEEPER_HOST.acme.com | Mandatory
    DB_FLAVOR | Database type for the Hive and Oozie metastores (allowed databases are SQL Server and Derby). To use the default embedded Derby instance, set the value of this property to derby. To use an existing SQL Server instance as the metastore database, set the value to mssql. | mssql or derby | Mandatory
    DB_HOSTNAME | FQDN for the node where the metastore database service is installed. If using SQL Server, set the value to your SQL Server host name. If using Derby for the Hive metastore, set the value to HIVE_SERVER_HOST. | sqlserver1.acme.com | Mandatory
    DB_PORT | Required only if you are using SQL Server for the Hive and Oozie metastores. By default, the database port is set to 1433. | 1433 | Optional
    HIVE_DB_NAME | Database for the Hive metastore. If using SQL Server, ensure that you create the database on the SQL Server instance. | hivedb | Mandatory
    HIVE_DB_USERNAME | User account credentials for the Hive metastore database instance. Ensure that this user account has appropriate permissions. | hive_user | Mandatory
    HIVE_DB_PASSWORD | (Same as HIVE_DB_USERNAME: user account credentials for the Hive metastore database instance.) | hive_pass | Mandatory
    OOZIE_DB_NAME | Database for the Oozie metastore. If using SQL Server, ensure that you create the database on the SQL Server instance. | ooziedb | Mandatory
    OOZIE_DB_USERNAME | User account credentials for the Oozie metastore database instance. Ensure that this user account has appropriate permissions. | oozie_user | Mandatory
    OOZIE_DB_PASSWORD | (Same as OOZIE_DB_USERNAME: user account credentials for the Oozie metastore database instance.) | oozie_pass | Mandatory
    HA | Whether or not to deploy a highly available NameNode. | yes or no | Optional
    HA_JOURNALNODE_HOSTS | A comma-separated list of FQDNs for the cluster nodes that will run the JournalNode processes. | journalnode1.acme.com, journalnode2.acme.com, journalnode3.acme.com | Optional
    HA_CLUSTER_NAME | This name is used for both the configuration and the authority component of absolute HDFS paths in the cluster. | hdp2-ha | Optional
    HA_JOURNALNODE_EDITS_DIR | The absolute path on the JournalNode machines where the edits and other local state used by the JournalNodes (JNs) are stored. You can use only a single path for this property. | d:\hadoop\journal | Optional
    HA_NAMENODE_HOST | The host for the standby NameNode. | STANDBY_NAMENODE.acme.com | Optional
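The HA rows at the bottom of the table are listed as optional, but the note above says they must be included to enable NameNode HA. The sketch below shows one way to express that dependency. It assumes that setting HA to yes requires all four HA_ properties to be present and non-empty, which is an interpretation of the note rather than documented installer behavior. The function name `missing_ha_properties` is hypothetical.

```python
# Property names are taken from the table above. Treating all four as required
# whenever HA=yes is an assumption made for this sketch.
HA_PROPERTIES = (
    "HA_JOURNALNODE_HOSTS",
    "HA_CLUSTER_NAME",
    "HA_JOURNALNODE_EDITS_DIR",
    "HA_NAMENODE_HOST",
)


def missing_ha_properties(props: dict) -> list:
    """Return the HA property names that are absent or empty when HA is 'yes'.

    Returns an empty list when HA is not enabled.
    """
    if props.get("HA", "no").strip().lower() != "yes":
        return []
    return [name for name in HA_PROPERTIES if not props.get(name, "").strip()]


if __name__ == "__main__":
    example = {"HA": "yes", "HA_CLUSTER_NAME": "hdp2-ha"}
    print(missing_ha_properties(example))
```

Here `props` is a plain dictionary of property names to values, for example one built by reading the KEY=value lines of clusterproperties.txt.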

    The following snapshot illustrates a sample clusterproperties.txt file:

    #Log directory
    HDP_LOG_DIR=d:\hadoop\logs
    
    #Data directory
    HDP_DATA_DIR=d:\hdp\data
    
    #Hosts
    NAMENODE_HOST=NAMENODE_MASTER.acme.com
    SECONDARY_NAMENODE_HOST=SECONDARY_NAMENODE_MASTER.acme.com
    RESOURCEMANAGER_HOST=RESOURCE_MANAGER.acme.com
    HIVE_SERVER_HOST=HIVE_SERVER_MASTER.acme.com
    OOZIE_SERVER_HOST=OOZIE_SERVER_MASTER.acme.com
    WEBHCAT_HOST=WEBHCAT_MASTER.acme.com
    FLUME_HOSTS=FLUME_SERVICE1.acme.com,FLUME_SERVICE2.acme.com,FLUME_SERVICE3.acme.com
    HBASE_MASTER=HBASE_MASTER.acme.com
    HBASE_REGIONSERVERS=slave1.acme.com, slave2.acme.com, slave3.acme.com
    ZOOKEEPER_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
    SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
    
    #Database host
    DB_FLAVOR=derby
    DB_HOSTNAME=DB_myHostName
    
    #Hive properties
    HIVE_DB_NAME=hive
    HIVE_DB_USERNAME=hive
    HIVE_DB_PASSWORD=hive
    
    #Oozie properties
    OOZIE_DB_NAME=oozie
    OOZIE_DB_USERNAME=oozie
    OOZIE_DB_PASSWORD=oozie
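A file in this format can be checked for completeness before it is passed to the installer. The following sketch assumes the format is exactly what the sample shows: one KEY=value pair per line, with lines starting with # treated as comments. The installer's own parsing rules may differ. The function names `parse_properties`, `missing_mandatory` and `split_hosts` are hypothetical, and the list of mandatory names is copied from the table in this section.

```python
# Names marked Mandatory in the table in this section.
MANDATORY = (
    "HDP_LOG_DIR", "HDP_DATA_DIR", "NAMENODE_HOST", "SECONDARY_NAMENODE_HOST",
    "RESOURCEMANAGER_HOST", "HIVE_SERVER_HOST", "OOZIE_SERVER_HOST",
    "WEBHCAT_HOST", "FLUME_HOSTS", "HBASE_MASTER", "HBASE_REGIONSERVERS",
    "SLAVE_HOSTS", "ZOOKEEPER_HOSTS", "DB_FLAVOR", "DB_HOSTNAME",
    "HIVE_DB_NAME", "HIVE_DB_USERNAME", "HIVE_DB_PASSWORD",
    "OOZIE_DB_NAME", "OOZIE_DB_USERNAME", "OOZIE_DB_PASSWORD",
)


def parse_properties(text: str) -> dict:
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=value line: {line!r}")
        props[key.strip()] = value.strip()
    return props


def missing_mandatory(props: dict) -> list:
    """Return mandatory property names that are absent or empty."""
    return [name for name in MANDATORY if not props.get(name)]


def split_hosts(value: str) -> list:
    """Split a comma-separated host list, tolerating spaces after commas."""
    return [host.strip() for host in value.split(",") if host.strip()]


if __name__ == "__main__":
    with open("clusterproperties.txt", encoding="utf-8") as handle:
        properties = parse_properties(handle.read())
    print("Missing mandatory properties:", missing_mandatory(properties))
    print("Slave hosts:", split_hosts(properties.get("SLAVE_HOSTS", "")))
```

The script reads clusterproperties.txt from the current directory. Adjust the path to wherever the file was created in step 1.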
                  
                

