Apache Hadoop High Availability

HBase Cluster Replication for Geographic Data Distribution

HBase provides a cluster replication mechanism that keeps one cluster’s state synchronized with that of another cluster by using the write-ahead log (WAL) of the source cluster to propagate changes. Use cases for cluster replication include:

  • Backup and disaster recovery

  • Data aggregation

  • Geographic data distribution, such as across data centers

  • Online data ingestion combined with offline data analytics

Note

Replication is enabled at the granularity of the column family. Before enabling replication for a column family, create the table and all column families to be replicated on the destination cluster.
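The ordering in the note can be sketched as a small helper that renders the HBase shell statements involved: a `create` for the destination cluster first, then one `alter` per column family on the source cluster with `REPLICATION_SCOPE` set to 1. This is only an illustrative sketch. The table name `orders`, the column family names, and the helper functions are hypothetical, and peer registration between the clusters is deliberately left out because its shell syntax varies between HBase versions.

```python
# Sketch: render HBase shell statements for column-family-level replication.
# All table and column family names used here are hypothetical examples.

def create_stmt(table, families):
    """Render a `create` statement to run on the DESTINATION cluster."""
    cols = ", ".join("{NAME => '%s'}" % cf for cf in families)
    return "create '%s', %s" % (table, cols)


def enable_replication_stmt(table, family):
    """Render an `alter` statement to run on the SOURCE cluster.

    Setting REPLICATION_SCOPE to 1 marks the column family for replication.
    """
    return "alter '%s', {NAME => '%s', REPLICATION_SCOPE => '1'}" % (table, family)


def replication_plan(table, replicated_families):
    """Return (cluster, statement) pairs in the order they should be run.

    The destination table is created first, so that every replicated
    column family already exists there before the source starts shipping edits.
    """
    plan = [("destination", create_stmt(table, replicated_families))]
    for cf in replicated_families:
        plan.append(("source", enable_replication_stmt(table, cf)))
    return plan


if __name__ == "__main__":
    for cluster, stmt in replication_plan("orders", ["d", "m"]):
        print("[%s] %s" % (cluster, stmt))
```

Each rendered statement would be pasted into `hbase shell` on the cluster named in the first element of the pair. Column families that are not passed to `replication_plan` keep their existing replication scope and are not touched.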