DLM Installation and Upgrade

Preparing HDP cluster for Hive cloud replication

Hive replication from an on-premises cluster to cloud storage requires a minimal cluster on the target with metadata services such as Hive Metastore (HMS), Ranger, Atlas, and the DLM Engine. HMS should be configured with the Hive warehouse directory on cloud storage. Complete the following steps:

  1. Hive data locations - The Hive metastore requires the following configurations to point Hive data to cloud storage. Note that both hive.metastore.warehouse.dir and hive.repl.replica.functions.root.dir should point to locations in the same bucket.
    
    hive.metastore.warehouse.dir=<cloud storage>
    hive.repl.replica.functions.root.dir=<cloud storage>
    hive.warehouse.subdir.inherit.perms=false
  2. Cloud access credentials - When the Hive metastore is configured with the Hive warehouse directory on cloud storage, Hive also requires credentials to access the cloud storage. These can be set up with one of the following configurations:
    • Access key and secret key
    • Session token
    • For IaaS clusters, set up instance profiles
  3. Cloud encryption configurations - If the bucket is encrypted, set up the bucket encryption details.
    Note

    Set all these configurations in hive-site.xml.
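
The three steps above might come together in hive-site.xml roughly as follows. This is a minimal sketch that assumes Amazon S3 accessed through the Hadoop S3A connector, which the steps above do not specify. The bucket name example-bucket, the paths, and all key values are hypothetical placeholders. The fs.s3a.* properties are standard S3A connector settings rather than anything DLM-specific, and their exact names can vary with the Hadoop version, so verify them against the documentation for your HDP release.

```xml
<!-- Step 1: Hive data locations (both paths in the same bucket; bucket and paths are placeholders) -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>s3a://example-bucket/apps/hive/warehouse</value>
</property>
<property>
  <name>hive.repl.replica.functions.root.dir</name>
  <value>s3a://example-bucket/apps/hive/repl/functions</value>
</property>
<property>
  <name>hive.warehouse.subdir.inherit.perms</name>
  <value>false</value>
</property>

<!-- Step 2: cloud access credentials, here the access key and secret key option (placeholder values) -->
<property>
  <name>fs.s3a.access.key</name>
  <value>ACCESS_KEY_PLACEHOLDER</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>SECRET_KEY_PLACEHOLDER</value>
</property>

<!-- Step 3: only if the bucket is encrypted; SSE-KMS is assumed here as one possible scheme -->
<property>
  <name>fs.s3a.server-side-encryption-algorithm</name>
  <value>SSE-KMS</value>
</property>
<property>
  <name>fs.s3a.server-side-encryption.key</name>
  <value>KMS_KEY_ARN_PLACEHOLDER</value>
</property>
```

Under the same S3A assumption, the session token option would typically add fs.s3a.session.token and set fs.s3a.aws.credentials.provider to org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider, while a cluster using instance profiles would leave the key properties out entirely.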