Load Balance Strategy
To distribute the data in a flow across the nodes in the cluster, NiFi offers the
following load balance strategies:
Do not load balance: Do not load balance FlowFiles between nodes in the
cluster. This is the default.
Partition by attribute: Determines which node receives a given FlowFile based on
the value of a user-specified FlowFile attribute. All FlowFiles that have the same
value for the attribute are sent to the same node in the cluster. If the
destination node is disconnected from the cluster or cannot be reached, the data
does not fail over to another node; it queues until the node is available again.
If a node joins or leaves the cluster and the data must be rebalanced, consistent
hashing is applied to avoid having to redistribute all of the data.
Round robin: FlowFiles are distributed to nodes in the cluster in a round-robin
fashion. If a node is disconnected from the cluster or cannot be reached, the data
queued for that node is automatically redistributed to one or more other nodes.
Single node: All FlowFiles are sent to a single node in the cluster. Which node
they are sent to is not configurable. If that node is disconnected from the cluster
or cannot be reached, the data queued for it remains queued until the node is
available again.
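The benefit of consistent hashing for the "Partition by attribute" strategy can be illustrated with a small, generic sketch. This is not NiFi's implementation: the ring class, the node names, the virtual-node count, and the choice of SHA-256 are all assumptions made for illustration. It shows that when a node joins, the only attribute values that change owner are the ones reassigned to the new node.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ConsistentHashRing:
    """Toy hash ring. Node names and vnode count are hypothetical."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed on the ring at several pseudo-random points.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._ring]

    def node_for(self, attribute_value: str) -> str:
        # The owner is the first ring point clockwise from the value's hash.
        idx = bisect.bisect_right(self._keys, stable_hash(attribute_value))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical partitioning attribute values and cluster membership.
values = [f"customer-{i}" for i in range(1000)]
before = ConsistentHashRing(["node1", "node2", "node3"])
after = ConsistentHashRing(["node1", "node2", "node3", "node4"])

# Count values whose owner changed, and how many of those went to the new node.
moved = [v for v in values if before.node_for(v) != after.node_for(v)]
moved_to_new = [v for v in moved if after.node_for(v) == "node4"]

print(f"{len(moved)} of {len(values)} values changed node")
print(f"{len(moved_to_new)} of those moved to the new node")
```

By contrast, a naive `hash(value) % node_count` assignment would remap most values whenever the node count changes, which is the full redistribution that consistent hashing avoids.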
In addition to the UI settings, there are Cluster Node Properties related to load
balancing that must also be configured in nifi.properties.
NiFi persists the nodes that are in a cluster across restarts. This prevents the
redistribution of data until all of the nodes have connected. If the cluster is shut down
and a node is not intended to be brought back up, the user is responsible for removing
the node from the cluster via the "Cluster" dialog in the UI (see Managing Nodes for
Load Balance Compression
After selecting the load balance strategy, the user can configure whether or not data
should be compressed when being transferred between nodes in the cluster.
The following compression options are available:
Do not compress: FlowFiles will not be compressed. This is the default.
Compress attributes only: FlowFile attributes will be compressed, but
FlowFile contents will not.
Compress attributes and content: FlowFile attributes and contents will be
compressed.
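The difference between the three compression options can be sketched as follows. This is only an illustration under stated assumptions: the mode labels, the JSON encoding of attributes, and the use of gzip are hypothetical choices for the example and do not describe NiFi's wire format.

```python
import gzip
import json

# Hypothetical labels for the three options described above.
MODES = ("none", "attributes", "attributes_and_content")


def encode_for_transfer(attributes: dict, content: bytes, mode: str):
    """Return (attribute_bytes, content_bytes) as they would be sent."""
    if mode not in MODES:
        raise ValueError(f"unknown compression mode: {mode}")
    attr_bytes = json.dumps(attributes, sort_keys=True).encode("utf-8")
    if mode == "none":
        return attr_bytes, content
    if mode == "attributes":
        # Only the attributes are compressed; content is sent unchanged.
        return gzip.compress(attr_bytes), content
    # Both attributes and content are compressed.
    return gzip.compress(attr_bytes), gzip.compress(content)


attributes = {"filename": "orders.csv", "customer.id": "42"}
content = b"order_id,amount\n" * 1000

for mode in MODES:
    attr_out, content_out = encode_for_transfer(attributes, content, mode)
    print(mode, len(attr_out), len(content_out))
```

Compressing content costs CPU on both the sending and receiving node, so it tends to pay off mainly for large, compressible payloads or constrained network links.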
Load Balance Indicator
When a load balance strategy has been implemented for a connection, a load balance
indicator will appear on the connection.
Hovering over the icon will display the connection's load balance strategy and
compression configuration. The icon in this state also indicates that all data in the
connection has been distributed across the cluster.
When data is actively being transferred between the nodes in the cluster, the load
balance indicator will change orientation and color.
Cluster Connection Summary
To see where data has been distributed among the cluster nodes, select Summary from the
Global Menu. Then select the "Connections" tab and the "View Connection Details" icon
for a source.
This will open the Cluster Connection Summary dialog, which shows the data on each node
in the cluster.