2. Define Cluster Details

  1. Run hostname -f on each host machine to identify its fully qualified domain name (FQDN).

    Note

    If you are deploying on Amazon EC2, use the Internal FQDN.

  2. On the master-install-location, change directory to master-install-location/gsInstaller.

  3. Create the following flat text files:

    Note

    The mandatory files are required for a minimal install (Apache Hadoop core components). Each optional file is needed only if you want to install the corresponding component (for example, HBase, Hive, or WebHCat) in your cluster.

    • Mandatory files: gateway, namenode, snamenode, jobtracker, nodes

      Note

      The nodes file is used to define the DataNodes and TaskTrackers.

    • Optional files: hbasemaster, hivemetastore, webhcatnode, nagiosserver, gangliaserver, oozieserver, hbasenodes, zknodes

      Note

      The hbasenodes file is used to define the RegionServers for your HBase cluster.

  4. Provide the FQDNs of your host machines in each of these text files:

    • Option I (single-node installations): Provide the FQDN of the same host machine in all of the text files.

    • Option II (multi-node installations):

      1. For the following files, provide the FQDN of EXACTLY one host machine:

        gateway, namenode, snamenode, jobtracker, hbasemaster, hivemetastore, oozieserver, webhcatnode, nagiosserver, gangliaserver.

      2. For the following files, provide the FQDNs (separated by a newline character) of a MINIMUM of three host machines:

        nodes, hbasenodes

      3. For the zknodes file, provide the FQDN of a MINIMUM of one host machine.

        Note

        If you list multiple host machines, they must follow the ZooKeeper ensemble rule.
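
The steps above can be sketched as a short shell script for a multi-node install. This is only an illustration under stated assumptions: the hostnames (master1.example.com, worker1.example.com, and so on) are hypothetical placeholders for the values that hostname -f reports on your machines, and the GS_DIR variable is an invented convenience that defaults to the current directory. Only the mandatory files are written here; optional files such as hbasemaster or zknodes would be created the same way.

```shell
#!/bin/sh
# Sketch: populate the mandatory gsInstaller host files for a multi-node cluster.
# All hostnames are hypothetical placeholders; substitute the FQDNs reported
# by `hostname -f` on each of your machines.
set -eu

# Directory that holds the flat text files (assumption: GS_DIR is not part of
# gsInstaller). On a real install, run this from master-install-location/gsInstaller.
GS_DIR="${GS_DIR:-.}"

MASTER="master1.example.com"   # hypothetical FQDN of the master host

# Files that must contain the FQDN of exactly one host machine.
for f in gateway namenode snamenode jobtracker; do
    printf '%s\n' "$MASTER" > "$GS_DIR/$f"
done

# nodes defines the DataNodes and TaskTrackers: one FQDN per line, at least three.
printf '%s\n' \
    worker1.example.com \
    worker2.example.com \
    worker3.example.com > "$GS_DIR/nodes"

# Sanity checks: count the non-empty lines in each file.
for f in gateway namenode snamenode jobtracker; do
    count=$(grep -c . "$GS_DIR/$f" || true)
    if [ "$count" -ne 1 ]; then
        echo "$f must list exactly one host" >&2
        exit 1
    fi
done

count=$(grep -c . "$GS_DIR/nodes" || true)
if [ "$count" -lt 3 ]; then
    echo "nodes must list at least three hosts" >&2
    exit 1
fi

echo "host files written to $GS_DIR"
```

For a single-node install, the same approach applies with one FQDN written to every file, including nodes; the three-host check would then need to be relaxed.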

