Planning for the HDP Cluster

Conclusion

Achieving optimal results from a Hadoop implementation begins with choosing the correct hardware and software stacks. The effort involved in the planning stages can pay off dramatically in terms of the performance and the total cost of ownership (TCO) associated with the environment.

The following composite system stack recommendations can help organizations in the planning stages:

| Machine Type | Workload Pattern/Cluster Type | Storage | Processor (# of Cores) | Memory (GB) | Network |
|---|---|---|---|---|---|
| Slaves | Balanced workload | Twelve 2-3 TB disks | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
| Slaves | Compute-intensive workload | Twelve 1-2 TB disks | 10 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
| Slaves | Storage-heavy workload | Twelve 4+ TB disks | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
| NameNode | Balanced workload | Four or more 2-3 TB RAID 10 with spares | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
| ResourceManager | Balanced workload | Four or more 2-3 TB RAID 10 with spares | 8 | 128-256 | 1 GbE onboard, 2x10 GbE mezzanine/external |
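As a rough illustration of how the slave-node storage recommendations above translate into capacity, the following Python sketch computes the raw disk capacity per slave node for each workload type. The names `SlaveProfile`, `SLAVE_PROFILES`, `raw_capacity_tb`, and `usable_capacity_tb` are hypothetical and not part of HDP. The usable-capacity estimate assumes the default HDFS replication factor of 3 and ignores space needed for the operating system, logs, and intermediate job data, so treat it as an upper bound rather than a sizing rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SlaveProfile:
    """Hypothetical container for one slave-node row of the recommendations."""
    disks: int
    min_tb_per_disk: float
    max_tb_per_disk: Optional[float]  # None means open-ended (e.g. "4+ TB")
    cores: int


# Values copied from the slave-node recommendations; the keys are illustrative.
SLAVE_PROFILES = {
    "balanced": SlaveProfile(disks=12, min_tb_per_disk=2, max_tb_per_disk=3, cores=8),
    "compute-intensive": SlaveProfile(disks=12, min_tb_per_disk=1, max_tb_per_disk=2, cores=10),
    "storage-heavy": SlaveProfile(disks=12, min_tb_per_disk=4, max_tb_per_disk=None, cores=8),
}


def raw_capacity_tb(profile: SlaveProfile) -> Tuple[float, Optional[float]]:
    """Return the (low, high) raw disk capacity in TB for one slave node.

    The high value is None when the recommendation has no upper bound.
    """
    low = profile.disks * profile.min_tb_per_disk
    high = None
    if profile.max_tb_per_disk is not None:
        high = profile.disks * profile.max_tb_per_disk
    return low, high


def usable_capacity_tb(raw_tb: float, replication: int = 3) -> float:
    """Estimate HDFS-usable capacity, assuming every block is stored
    `replication` times and no space is set aside for anything else."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return raw_tb / replication


if __name__ == "__main__":
    for name, profile in SLAVE_PROFILES.items():
        low, high = raw_capacity_tb(profile)
        print(name, "raw TB per node:", low, "to", high if high is not None else "open-ended")
```

Under these assumptions, a balanced slave node with twelve 2-3 TB disks provides 24-36 TB of raw storage per node.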

Note
Reserve at least 2.5 GB of hard drive space for each version of HDP to be installed.
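
The reservation in the note scales linearly with the number of installed HDP versions. A minimal Python sketch of that arithmetic follows; the function and constant names are hypothetical, and the result is a lower bound because the note specifies "at least" 2.5 GB per version.

```python
# Minimum disk space per installed HDP version, taken from the note above.
MIN_GB_PER_HDP_VERSION = 2.5


def hdp_reserved_space_gb(num_versions: int) -> float:
    """Return the minimum disk space (GB) to reserve for the given
    number of installed HDP versions."""
    if num_versions < 0:
        raise ValueError("num_versions must be non-negative")
    return num_versions * MIN_GB_PER_HDP_VERSION


if __name__ == "__main__":
    # Example: keeping three HDP versions installed side by side.
    print(hdp_reserved_space_gb(3))
```

For example, keeping three versions installed side by side calls for at least 7.5 GB of reserved space.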