Using Apache HBase to store and access data

Command for creating HBase backup image

Use the hbase backup create command as the HBase superuser to create a complete backup image.

Ensure that backup is enabled on the cluster. To enable backup, add the following properties to hbase-site.xml and restart the HBase cluster.

<property>
<name>hbase.backup.enable</name>
<value>true</value>
</property>
<property>
<name>hbase.master.logcleaner.plugins</name>
<value>YOUR_PLUGINS,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
<name>hbase.procedure.master.classes</name>
<value>YOUR_CLASSES,org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
<name>hbase.procedure.regionserver.classes</name>
<value>YOUR_CLASSES,org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
<value>YOUR_CLASSES,org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
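As a quick sanity check before restarting the cluster, the properties listed above can be looked up in hbase-site.xml. The following is a minimal sketch, not an official tool: the function name check_backup_config and the default file path are assumptions, and the check only confirms that each property name is present, not that its value is correct.

```shell
#!/usr/bin/env bash
# Sketch: report which backup-related properties are missing from hbase-site.xml.
# The default path is an assumption; pass the real location as the first argument.

check_backup_config() {
  local site_xml="${1:-/etc/hbase/conf/hbase-site.xml}"
  local missing=0
  local prop
  for prop in \
      hbase.backup.enable \
      hbase.master.logcleaner.plugins \
      hbase.procedure.master.classes \
      hbase.procedure.regionserver.classes \
      hbase.coprocessor.region.classes; do
    # Fixed-string match on the <name> element only; values are not validated.
    if ! grep -qF "<name>${prop}</name>" "$site_xml" 2>/dev/null; then
      echo "missing: ${prop}"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_backup_config with the path to your hbase-site.xml prints one line per missing property and returns a non-zero status if any are absent.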
The hbase backup create command has the following usage:
hbase backup create <type> <backup_path> [options]

Required command-line arguments

type

Specifies the type of backup to execute: full or incremental.

The full argument creates a full backup image. The incremental argument creates an incremental backup image and requires that a full backup already exists.

backup_path

Specifies the full root path where the backup image is stored. Valid prefixes are hdfs:, webhdfs:, gpfs:, and s3fs:.
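The two required arguments can be checked before the command is submitted. The sketch below is illustrative only: build_backup_create is a hypothetical helper, the host name and directory are placeholders, and the function prints the resulting command line instead of running it so that it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Sketch: validate <type> and <backup_path>, then print the command line.

build_backup_create() {
  local type="$1" backup_path="$2"
  case "$type" in
    full|incremental) ;;
    *) echo "type must be full or incremental" >&2; return 1 ;;
  esac
  case "$backup_path" in
    hdfs:*|webhdfs:*|gpfs:*|s3fs:*) ;;
    *) echo "backup_path must start with hdfs:, webhdfs:, gpfs:, or s3fs:" >&2; return 1 ;;
  esac
  shift 2
  # Any remaining arguments are passed through unchanged as options.
  echo hbase backup create "$type" "$backup_path" "$@"
}

# Hypothetical NameNode host and backup directory.
build_backup_create full hdfs://namenode.example.com:8020/backups/hbase
# An incremental backup requires that a full backup already exists.
build_backup_create incremental hdfs://namenode.example.com:8020/backups/hbase
```

To run the backup for real, execute the printed command as the HBase superuser.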

Optional command-line arguments

-b <arg>

Specifies the bandwidth of each MapReduce task in MB per second.

-d <arg>

Enables DEBUG mode, which prints additional logging about the backup creation.

-q <arg>

Specifies the name of the YARN queue on which to run the backup create command.

-s <arg>

Identifies the tables to back up based on a backup set. Refer to "Using Backup Sets" for the purpose and usage of backup sets. This option is mutually exclusive with the -t (table list) option.

-t <arg>

A comma-separated list of tables to back up. If no tables are specified, all tables are backed up. Regular expressions and wildcards are not supported; all table names must be listed explicitly. This option is mutually exclusive with the -s option. One of these named options is required.

-w <arg>

Specifies the number of parallel MapReduce tasks to execute.
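The options above can be combined on one command line, as long as -s and -t are not used together. The following sketch shows one way to guard against that mistake. The helper names check_table_selection and run_backup are hypothetical, as are the path, table names, queue name, and tuning values. The check is a simple scan of the argument list, so it would also trip on an option value that is literally "-s" or "-t".

```shell
#!/usr/bin/env bash
# Sketch: reject -s together with -t, then print the command line.

check_table_selection() {
  local has_set=0 has_tables=0 arg
  for arg in "$@"; do
    case "$arg" in
      -s) has_set=1 ;;
      -t) has_tables=1 ;;
    esac
  done
  if [ "$has_set" -eq 1 ] && [ "$has_tables" -eq 1 ]; then
    echo "-s and -t are mutually exclusive" >&2
    return 1
  fi
}

run_backup() {
  check_table_selection "$@" || return 1
  # Print rather than execute, so the sketch can be tried without a cluster.
  echo hbase backup create "$@"
}

# Full backup of two tables on the "backups" YARN queue,
# with 4 parallel MapReduce tasks limited to 50 MB per second each.
run_backup full hdfs://namenode.example.com:8020/backups/hbase \
  -t orders,customers -q backups -w 4 -b 50
```

Replacing -t orders,customers with -s followed by a backup set name selects the tables from that set instead.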