Chapter 11. Highly Available Reads with HBase

Apache HBase 0.98.0 enables HBase administrators to configure their HBase clusters with read-only High Availability, or HA. This feature benefits HBase applications that require low-latency queries but can tolerate stale data, such as remote sensors, distributed messaging, object stores, and user profile management. HBase provides read-only HA on a per-table basis by replicating a table's regions to one or more secondary region servers. See Understanding Regions and RegionServers for more information.

HA for HBase features the following functionality:

  • Data safely protected in HDFS

  • Failed nodes are automatically recovered

  • No single point of failure

However, HBase administrators should carefully consider the following costs associated with using secondary region servers:

  • Double or triple memstore usage

  • Increased block cache usage

  • Increased network traffic for log replication

  • Extra backup RPCs for replicas

Important

This release of HA for HBase is not compatible with region splits and merges. Do not execute region merges on tables with region replicas. Instead, HBase administrators must split tables before enabling HA for HBase and then disable region splits with the DisabledRegionSplitPolicy. The split policy can be set either through the HBase API or with the hbase.regionserver.region.split.policy property in the region server's hbase-site.xml configuration file. The value set in hbase-site.xml serves as the default and can be overridden for individual HBase tables.
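
As a minimal sketch of the property-based approach, the fragment below sets the default split policy in hbase-site.xml. The property name and the policy name come from the text above. The fully qualified class name is an assumption: it places DisabledRegionSplitPolicy in the org.apache.hadoop.hbase.regionserver package, as in the Apache HBase distribution, and should be verified against the installed release.

```xml
<!-- hbase-site.xml on each region server -->
<!-- Sets the default split policy; individual tables can still override it. -->
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <!-- Assumed package name; confirm against your HBase release. -->
  <value>org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy</value>
</property>
```

Region servers generally read hbase-site.xml at startup, so a change like this would typically take effect only after the region servers are restarted.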

