1. Machine roles in a typical Hadoop cluster

A Hadoop and HBase cluster uses the following two types of machines:

  • Masters (HDFS NameNode, Secondary NameNode, YARN ResourceManagers, and the HBase Master)

    Note

    Add only a limited number of disks to the master nodes, because master nodes do not have high storage demands.

  • Slaves (HDFS DataNodes, YARN NodeManagers, and HBase RegionServers)
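The role split above can be written down as a small lookup table, which is handy when scripting cluster inventories. The following is a minimal sketch, not part of any Hadoop API: the names `ROLE_DAEMONS` and `machine_role` are hypothetical, and only the daemon-to-role grouping comes from the list above.

```python
# Daemon placement as described in the list above.
# The dictionary layout and all names here are illustrative assumptions.
ROLE_DAEMONS = {
    "master": ["NameNode", "Secondary NameNode", "ResourceManager", "HBase Master"],
    "slave": ["DataNode", "NodeManager", "HBase RegionServer"],
}


def machine_role(daemon: str) -> str:
    """Return the machine type ('master' or 'slave') that hosts the given daemon."""
    for role, daemons in ROLE_DAEMONS.items():
        if daemon in daemons:
            return role
    raise ValueError(f"unknown daemon: {daemon}")
```

For example, `machine_role("DataNode")` returns `"slave"`, while an unlisted daemon name raises `ValueError` rather than guessing a placement.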

Additionally, we strongly recommend that you use separate client machines for performing the following tasks:

  • Load data into the HDFS cluster

  • Submit YARN applications (describing how to process the data)

  • Retrieve or view the results of the job after its completion

  • Submit Pig or Hive queries
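The client tasks listed above typically map to a handful of command-line invocations run from the client machine. The sketch below only builds the argument lists and does not run anything. The function name `client_commands` and all of its parameters are hypothetical, and it assumes an application JAR that takes an input and an output directory as its arguments and writes `part-*` files, as MapReduce jobs commonly do.

```python
def client_commands(local_file, hdfs_dir, jar, main_class, out_dir, query):
    """Build argv lists for the four client tasks, in order.

    Assumes the (hypothetical) application takes <input dir> <output dir>
    as its arguments and writes part-* files into the output directory.
    """
    return [
        # 1. Load data into the HDFS cluster
        ["hdfs", "dfs", "-put", local_file, hdfs_dir],
        # 2. Submit a YARN application that processes the data
        ["yarn", "jar", jar, main_class, hdfs_dir, out_dir],
        # 3. View the results after the job completes
        ["hdfs", "dfs", "-cat", f"{out_dir}/part-*"],
        # 4. Submit a Hive query
        ["hive", "-e", query],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the Hadoop and Hive clients are installed and configured to reach the cluster.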

The following illustration shows a typical Hadoop cluster that follows the recommended settings for the client machines: