Apache Storm Component Guide

Performance Guidelines for Developing a Storm Topology

The following table lists several general performance-related guidelines for developing Storm topologies.

Table 4.4. Storm Topology Development Guidelines

Guideline

Description

Read topology configuration parameters from a file.

Rather than hard-coding configuration information in your Storm application, read the configuration parameters, including parallelism hints for specific components, from a file inside the main() method of the topology. This speeds up iterative debugging and tuning, because simple configuration changes no longer require you to edit and recompile code.
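The following is a minimal sketch of this guideline, not a prescribed implementation. It uses only java.util.Properties; the class name TopologySettings and the property names spout.parallelism and bolt.parallelism are hypothetical and chosen for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: holds topology settings loaded from a properties
// source, so that tuning them does not require recompiling the topology.
class TopologySettings {
    final int numWorkers;
    final int spoutParallelism;
    final int boltParallelism;

    TopologySettings(Properties props) {
        // Property names and default values are illustrative assumptions.
        numWorkers = intProp(props, "topology.workers", 1);
        spoutParallelism = intProp(props, "spout.parallelism", 1);
        boltParallelism = intProp(props, "bolt.parallelism", 1);
    }

    // Reads key=value pairs from any Reader, for example a file reader.
    static TopologySettings load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return new TopologySettings(props);
    }

    private static int intProp(Properties props, String key, int fallback) {
        String raw = props.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) throws IOException {
        // A real main() would open a file instead, for example with
        // Files.newBufferedReader(Paths.get(args[0])).
        String sample = "topology.workers=4\nspout.parallelism=2\n";
        TopologySettings s = load(new StringReader(sample));
        System.out.println(s.numWorkers + " " + s.spoutParallelism
                + " " + s.boltParallelism);
    }
}
```

In a topology's main() method, values loaded this way could be passed as the parallelism hint argument of TopologyBuilder.setSpout() and TopologyBuilder.setBolt(), and to Config.setNumWorkers().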

Use a cache.

Use a cache to improve performance by eliminating unnecessary operations over the network, such as making frequent external service or lookup calls for reference data needed for processing.
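One possible way to apply this guideline is a small, bounded, least-recently-used (LRU) cache placed in front of the external lookup. The sketch below is an assumption about how such a cache might look, built on java.util.LinkedHashMap; the class name ReferenceDataCache and the remoteLookup function are hypothetical stand-ins for your own lookup code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical bounded LRU cache in front of a slow reference-data lookup.
class ReferenceDataCache<K, V> {
    private final Map<K, V> cache;
    private final Function<K, V> remoteLookup;

    ReferenceDataCache(final int maxEntries, Function<K, V> remoteLookup) {
        this.remoteLookup = remoteLookup;
        // accessOrder=true keeps entries ordered from least to most
        // recently used, so the "eldest" entry is the LRU entry.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Not thread-safe: assumes one thread calls get() on a given instance.
    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = remoteLookup.apply(key); // network call only on a miss
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        int[] remoteCalls = {0};
        ReferenceDataCache<String, String> cache = new ReferenceDataCache<>(
                2, k -> { remoteCalls[0]++; return k.toUpperCase(); });
        cache.get("a");
        cache.get("a"); // served from the cache
        cache.get("b");
        System.out.println("remote calls: " + remoteCalls[0]);
    }
}
```

The size bound matters: an unbounded map would trade network latency for growing memory use in a long-running worker. If reference data can change, an expiry policy would also be needed, which this sketch omits.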

Tighten code in the execute() method.

Every tuple is processed by the execute() method, so verify that the code in this method is as tight and efficient as possible. Work that does not depend on the individual tuple is a good candidate to move out of this method.
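As an illustration of a tight execute() method, the sketch below moves one-time work (compiling a regular expression) out of the per-tuple path. To stay self-contained, the class does not extend any Storm class; LogLevelExtractor is a hypothetical stand-in whose prepare() and execute() methods only mirror the roles of the corresponding bolt methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in for a bolt. It does not use Storm types; the method names
// only mirror the roles of a bolt's prepare() and execute().
class LogLevelExtractor {
    private Pattern levelPattern;

    // One-time setup: compile the regex once, not once per tuple.
    void prepare() {
        levelPattern = Pattern.compile("\\b(ERROR|WARN|INFO)\\b");
    }

    // Per-tuple hot path: no regex compilation and no I/O.
    String execute(String logLine) {
        Matcher m = levelPattern.matcher(logLine);
        return m.find() ? m.group(1) : "UNKNOWN";
    }

    public static void main(String[] args) {
        LogLevelExtractor extractor = new LogLevelExtractor();
        extractor.prepare();
        System.out.println(extractor.execute("12:00:01 WARN queue is filling up"));
    }
}
```

The same idea applies to other reusable resources, such as parsers or client connections: create them once during setup and reuse them for every tuple.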

Perform benchmark testing to determine latencies.

Perform benchmark testing of the critical points in the network flow of your topology. Knowing the capacity of your data "pipes" provides a reliable standard for judging the performance of your topology and its individual components.
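A simple way to start benchmarking a critical point is to time the operation in isolation and look at latency percentiles rather than only the average. The following sketch assumes a hypothetical LatencyBenchmark helper that uses System.nanoTime(); it measures a single operation in one JVM and is not a substitute for end-to-end measurements on a running cluster.

```java
import java.util.Arrays;
import java.util.function.IntConsumer;

// Hypothetical micro-benchmark helper for timing one operation.
class LatencyBenchmark {

    // Runs warm-up calls, then returns sorted per-call latencies in ns.
    static long[] measure(IntConsumer operation, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            operation.accept(i); // lets the JIT compile the hot path first
        }
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            operation.accept(i);
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    // Nearest-rank percentile over sorted samples, with p in (0, 100].
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the call you want to time,
        // such as a lookup against an external service.
        long[] sorted = measure(i -> Integer.toString(i).hashCode(), 1_000, 10_000);
        System.out.println("p50 ns: " + percentile(sorted, 50));
        System.out.println("p99 ns: " + percentile(sorted, 99));
    }
}
```

Comparing these baseline numbers with the latencies you observe in the running topology can help show whether a slow component is limited by its own code or by the external system it calls.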