Planning your deployment

Hardware Sizing Recommendations

Recommendations for Kafka

  • Kafka broker node: eight cores, 64 GB to 128 GB of RAM, two or more 8-TB SAS/SSD disks, and a 10-GbE NIC.

  • Minimum of three Kafka broker nodes

  • Hardware profile: More RAM and faster disks are better; a 10-GbE NIC is ideal.

  • 75 MB/sec per node is a conservative estimate. Throughput can go much higher with more RAM and less lag between writing and reading, in which case a 10-GbE NIC is required.

With the minimum of three nodes in your cluster, you can expect about 225 MB/sec of data transfer (3 nodes × 75 MB/sec).

You can perform further sizing by using the following formula: num_brokers = desired_throughput (MB/sec) / 75
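The Kafka formula above can be sketched in a few lines of Python. This is only an illustration: the function name and constants are hypothetical, and rounding up to a whole broker and enforcing the three-broker minimum are assumptions layered on top of the stated formula.

```python
import math

MB_PER_SEC_PER_BROKER = 75  # conservative per-node estimate from the text
MIN_BROKERS = 3             # minimum cluster size from the text


def kafka_brokers_needed(desired_mb_per_sec: float) -> int:
    """Estimate the broker count for a target throughput in MB/sec.

    Applies num_brokers = desired_throughput / 75, rounds up to a whole
    node (assumption), and never returns fewer than the minimum of three.
    """
    if desired_mb_per_sec < 0:
        raise ValueError("throughput must be non-negative")
    by_formula = math.ceil(desired_mb_per_sec / MB_PER_SEC_PER_BROKER)
    return max(MIN_BROKERS, by_formula)


if __name__ == "__main__":
    for target in (100, 225, 600):
        print(target, "MB/sec ->", kafka_brokers_needed(target), "brokers")
```

Because 75 MB/sec is described as a conservative figure, treat the result as a starting point rather than a guaranteed capacity.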

Recommendations for Storm

  • Storm worker node: 8 cores, 64 GB RAM, 1 GbE NIC

  • Minimum of 3 Storm worker nodes

  • Nimbus node: Minimum of 2 Nimbus nodes, each with 4 cores and 8 GB RAM

  • Hardware profile: Disk I/O is not that important; more cores are better.

  • 50 MB/sec per node with a low-complexity to moderate-complexity topology reading from Kafka and no external lookups. Medium-complexity and high-complexity topologies might have reduced throughput.

With a cluster of at least 2 Nimbus nodes and 2 worker nodes, you can expect to run 100 MB/sec of a low-complexity to medium-complexity topology.

You can perform further sizing by using the following formula: num_worker_nodes = desired_throughput (MB/sec) / 50
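A minimal Python sketch of the Storm worker formula follows. The function and parameter names are hypothetical. Rounding up is an assumption, and the default minimum of three workers is taken from the recommendation list above; pass a different value if your cluster policy differs.

```python
import math

MB_PER_SEC_PER_WORKER = 50  # per-node estimate for low to moderate complexity


def storm_workers_needed(desired_mb_per_sec: float, min_workers: int = 3) -> int:
    """Estimate the Storm worker node count for a target throughput.

    Applies num_worker_nodes = desired_throughput / 50 and rounds up
    (assumption). More complex topologies may need more nodes than this.
    """
    if desired_mb_per_sec < 0:
        raise ValueError("throughput must be non-negative")
    by_formula = math.ceil(desired_mb_per_sec / MB_PER_SEC_PER_WORKER)
    return max(min_workers, by_formula)


if __name__ == "__main__":
    for target in (100, 175, 500):
        print(target, "MB/sec ->", storm_workers_needed(target), "workers")
```

The estimate covers worker nodes only; the Nimbus nodes are sized separately and do not scale with throughput in this formula.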

Recommendations for NiFi

NiFi is designed to take advantage of:
  • all the cores on a machine

  • all the network capacity

  • all the disk speed

  • many gigabytes of RAM (although usually not all) on a system

For this reason, it is important to run NiFi on dedicated nodes. The following are the recommended server and sizing specifications for NiFi:

  • Minimum of 3 nodes

  • 8+ cores per node (more is better)

  • 6+ disks per node (SSD or spinning)

  • At least 8 GB of memory per node

If you want the sustained throughput shown, then provide at least the hardware listed beneath it:

50 MB/sec and thousands of events per second
  • 1 or 2 nodes

  • 8 or more cores per node, although more is better

  • 6 or more disks per node (SSD or spinning)

  • 2 GB of memory per node

  • 1 GbE bonded NICs

100 MB/sec and tens of thousands of events per second
  • 3 or 4 nodes

  • 16 or more cores per node, although more is better

  • 6 or more disks per node (SSD or spinning)

  • 2 GB of memory per node

  • 1 GbE bonded NICs

200 MB/sec and hundreds of thousands of events per second
  • 5 to 7 nodes

  • 24 or more cores per node (effective CPUs)

  • 12 or more disks per node (SSD or spinning)

  • 4 GB of memory per node

  • 10 GbE bonded NICs

400 to 500 MB/sec and hundreds of thousands of events per second
  • 7 to 10 nodes

  • 24 or more cores per node (effective CPUs)

  • 12 or more disks per node (SSD or spinning)

  • 6 GB of memory per node

  • 10 GbE bonded NICs
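The NiFi throughput tiers above can be expressed as a small lookup. The following Python sketch is illustrative only: the class and function names are hypothetical, and treating each tier as covering everything up to its stated throughput (with the last tier capped at 500 MB/sec) is an assumption, since the source lists only specific throughput points.

```python
from typing import NamedTuple, Optional


class NifiTier(NamedTuple):
    max_mb_per_sec: int   # upper bound this tier is assumed to cover
    min_nodes: int
    max_nodes: int
    min_cores: int        # per node
    min_disks: int        # per node
    memory_gb: int        # per node


# Values copied from the sizing tiers listed above.
NIFI_TIERS = [
    NifiTier(50, 1, 2, 8, 6, 2),
    NifiTier(100, 3, 4, 16, 6, 2),
    NifiTier(200, 5, 7, 24, 12, 4),
    NifiTier(500, 7, 10, 24, 12, 6),
]


def nifi_tier_for(desired_mb_per_sec: float) -> Optional[NifiTier]:
    """Return the smallest tier covering the target, or None if beyond the table."""
    for tier in NIFI_TIERS:
        if desired_mb_per_sec <= tier.max_mb_per_sec:
            return tier
    return None


if __name__ == "__main__":
    tier = nifi_tier_for(150)
    if tier is not None:
        print(f"{tier.min_nodes} to {tier.max_nodes} nodes, "
              f"{tier.min_cores}+ cores, {tier.min_disks}+ disks, "
              f"{tier.memory_gb} GB memory per node")
```

A target above 500 MB/sec returns None because the tiers do not extend that far; such a deployment needs its own sizing exercise.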