Using Apache HBase to store and access data

Using Hive to access an existing HBase table example

Use the following steps to access the existing HBase table through Hive.

  • You can access the existing HBase table through Hive using the CREATE EXTERNAL TABLE statement. In the example below, the mapping entry “cf1:val” stands for the column family and qualifier that hold the value in your HBase table:
    CREATE EXTERNAL TABLE hbase_table_2(key int, value string) 
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
    TBLPROPERTIES("hbase.table.name" = "some_existing_table", "hbase.mapred.output.outputtable" = "some_existing_table");
  • You can use different types of column mappings to map the HBase columns to Hive:
    • Multiple Columns and Families

      To define four columns, the first being the rowkey: “:key,cf:a,cf:b,cf:c”

    • Hive MAP to HBase Column Family

      When the Hive datatype is a Map, a column family with no qualifier can be used. The keys of the Map are then used as the column qualifiers in HBase: “cf:”

    • Hive MAP to HBase Column Prefix

      When the Hive datatype is a Map, a prefix for the column qualifier can be provided which will be prepended to the Map keys: “cf:prefix_.*”

      Note: The prefix is stripped from the column qualifier when it becomes the key in the Hive Map. For example, with the column mapping above, the HBase column “cf:prefix_a” yields the Map key “a”.
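
    The three mapping styles above can be sketched as DDL. This is a minimal illustration, not a definitive setup: the Hive table names (hbase_multi_cols, hbase_map_family, hbase_map_prefix) and Hive column names are hypothetical, and the statements assume that the HBase table some_existing_table already has a column family named cf.

    ```sql
    -- Multiple columns: the row key plus three qualifiers in family cf.
    -- Hive columns are matched to mapping entries by position.
    CREATE EXTERNAL TABLE hbase_multi_cols(key int, a string, b string, c string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:a,cf:b,cf:c")
    TBLPROPERTIES ("hbase.table.name" = "some_existing_table");

    -- Hive MAP to a whole column family: every qualifier in cf
    -- becomes a key of the Map.
    CREATE EXTERNAL TABLE hbase_map_family(key int, attrs map<string,string>)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:")
    TBLPROPERTIES ("hbase.table.name" = "some_existing_table");

    -- Hive MAP to a column prefix: only qualifiers that start with
    -- prefix_ are exposed through the Map.
    CREATE EXTERNAL TABLE hbase_map_prefix(key int, attrs map<string,string>)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:prefix_.*")
    TBLPROPERTIES ("hbase.table.name" = "some_existing_table");
    ```

    Because all three definitions are external tables over the same HBase table, dropping any of them removes only the Hive metadata and leaves the HBase data in place.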

  • You can also define composite row keys. Composite row keys use multiple Hive columns to generate the HBase row key.
    • Simple Composite Row Keys

      When the Hive column mapped to the row key has a datatype of Struct, Hive automatically concatenates all elements of the struct, separated by the termination character specified in the DDL.

    • Complex Composite Row Keys and HBaseKeyFactory

      Custom logic can be implemented by writing Java code to implement a KeyFactory and provide it to the DDL using the table property key “hbase.composite.key.factory”.
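
Both kinds of composite row key can be sketched as DDL. The statements below are illustrative only: the table names, the struct field names f1 and f2, the column family f, and the '~' delimiter are assumptions, and com.example.hive.MyKeyFactory is a hypothetical class that you would write and add to the Hive classpath yourself. Passing the factory property in SERDEPROPERTIES next to the column mapping is also an assumption of this sketch.

```sql
-- Simple composite row key: the struct fields f1 and f2 are joined
-- with '~' to form the HBase row key (for example "a~b").
CREATE EXTERNAL TABLE delimited_example(key struct<f1:string, f2:string>, value string)
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '~'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f:c1")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");

-- Complex composite row key: a custom key factory class decides how
-- the struct is serialized to, and parsed from, the HBase row key.
-- com.example.hive.MyKeyFactory is a placeholder, not a shipped class.
CREATE EXTERNAL TABLE custom_key_example(key struct<f1:string, f2:string>, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,f:c1",
  "hbase.composite.key.factory" = "com.example.hive.MyKeyFactory")
TBLPROPERTIES ("hbase.table.name" = "some_existing_table");
```

The delimiter-based form needs no custom code, so it is usually the simpler starting point. A key factory is worth the extra effort only when the row key layout cannot be expressed with a single separator character.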