DLM Administration

Replication of data from Amazon S3 to on-premises HDFS

To replicate data from Amazon S3 cloud storage to an on-premises cluster, you must create a new replication policy.

Before you create a new replication policy, you must register your Amazon S3 cloud account. For more information, see Register cloud credentials. You must have the Infra Admin or DLM Admin role to perform this set of tasks.
  1. Select Policies and click Add Policy. By default, HDFS is selected as the service in the Create Replication Policy page.
  2. Enter the replication policy name and description.
  3. Click SELECT SOURCE.
  4. Select S3 as the type, select the cloud credential from the drop-down list, and enter the S3 source path in the form bucket_name/path.
  5. Click SELECT DESTINATION.

    Make sure you have one or more clusters in the DLM application.

  6. Select Cluster as the type, and select the destination cluster from the drop-down list.
  7. Enter the destination path and click VALIDATE.
  8. Once the validation is successful, click SCHEDULE.
  9. Configure the job settings for the replication policy.
  10. Click ADVANCED SETTINGS to set up the policy queue.
  11. Click CREATE POLICY.

    The data replication process is enabled.

    View the job status on the Policies page, and verify that the job starts and runs as expected.
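
The S3 source path entered in step 4 takes the form bucket_name/path. A malformed value is easy to miss until validation, so it can help to check the string before entering it. The following is a minimal sketch of such a check, not part of DLM: the helper name split_s3_source_path is hypothetical, and it assumes that both a bucket name and a path are required and that the bucket name follows the general Amazon S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit).

```python
import re

# General S3 bucket naming rule (assumption: general-purpose buckets):
# 3-63 characters, lowercase letters, digits, dots, and hyphens,
# beginning and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def split_s3_source_path(source_path):
    """Split 'bucket_name/path' into (bucket_name, path).

    Hypothetical helper for illustration only; DLM does not provide it.
    Raises ValueError if the string is not in bucket_name/path form.
    """
    cleaned = source_path.strip()
    # Everything before the first slash is the bucket; the rest is the path.
    bucket, separator, path = cleaned.partition("/")
    if not _BUCKET_RE.fullmatch(bucket):
        raise ValueError(f"invalid bucket name: {bucket!r}")
    if not separator or not path:
        raise ValueError("expected a path after the bucket name")
    return bucket, path


if __name__ == "__main__":
    # Example value only; substitute your own bucket and path.
    print(split_s3_source_path("my-bucket/data/logs"))
```

A value that includes a URI scheme, such as one beginning with s3://, is rejected by this sketch because the text before the first slash is not a valid bucket name. Whether DLM itself accepts such a prefix is not stated here, so enter the path in the bucket_name/path form shown in step 4.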