Installing Apache Atlas

Migrate Atlas metadata when upgrading to HDP-3.0+

Perform the following steps to migrate the Atlas metadata from Titan to JanusGraph when upgrading from HDP-2.x to HDP-3.0 and higher versions.

  1. Before upgrading HDP and Atlas, use one of the following methods to determine the size of the Atlas metadata on the HDP-2.x cluster.
    • Click SEARCH on the Atlas web UI, then slide the green toggle button from Basic to Advanced. Enter the following query in the Search by Query box, then click Search.
      Asset select count() 
    • Run the following Atlas metrics REST API query:
      curl -g -X GET -u admin:admin -H "Content-Type: application/json" \
      -H "Cache-Control: no-cache" "http://<atlas_server>:21000/api/atlas/admin/metrics"

    Either method returns the number of Atlas entities, which you can use to estimate the time required to export the Atlas metadata from HDP-2.x and import it into HDP-3.x. The actual time varies with the cluster configuration. The following estimates are for a single node with 4 GB of RAM and a quad-core processor, hosting both the Atlas and Solr servers:

    • Estimated export rate from HDP-2.x: 2 million entities per hour.
    • Estimated import rate into HDP-3.x: 0.75 million entities per hour.
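
As a rough illustration of how the entity count translates into a maintenance window, the following Python sketch applies the two rates quoted above. The function name and the example count of 3 million entities are hypothetical, and the rates only hold for hardware comparable to the reference node, so treat the result as an order-of-magnitude estimate.

```python
# Reference rates from the estimates above (entities per hour).
# They apply to the reference node only; adjust for your own cluster.
EXPORT_RATE_PER_HOUR = 2_000_000
IMPORT_RATE_PER_HOUR = 750_000


def estimate_migration_hours(entity_count,
                             export_rate=EXPORT_RATE_PER_HOUR,
                             import_rate=IMPORT_RATE_PER_HOUR):
    """Return (export_hours, import_hours) for a given number of Atlas entities."""
    if entity_count < 0:
        raise ValueError("entity_count must not be negative")
    return entity_count / export_rate, entity_count / import_rate


# Hypothetical example: a cluster reporting 3 million entities.
export_h, import_h = estimate_migration_hours(3_000_000)
print(f"export: {export_h:.1f} h, import: {import_h:.1f} h")
# prints "export: 1.5 h, import: 4.0 h"
```
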
  2. Before upgrading HDP and Atlas, perform the following steps on the HDP-2.x cluster.
    1. Register the HDP target version as described in "Register and Install Target Version" in the Ambari Upgrade guide. Do not install the target version – only perform the steps to register the target version. This downloads the HDP-3.0 files.
    2. On the Ambari dashboard, click Atlas, then select Actions > Stop.
    3. Use the HDP-3.0 exporter tool to run the export. The tool is typically located at /usr/hdp/3.0.0.0-<build_number>/atlas/tools/migration-exporter. Use the following command format to start exporting the Atlas metadata:
      python /usr/hdp/3.0.0.0-<build_number>/atlas/tools/migration-exporter/atlas_migration.py -d <output directory>

      While running, the Atlas migration tool prevents Atlas use, and blocks all REST APIs and Atlas hook notification processing.

      As described previously, the time it takes to export the Atlas metadata depends on the number of entities and your cluster configuration. You can use the following command to display the export status:

      tail -f /var/log/atlas/atlas-migration-exporter.log

      When the export is complete, the data is placed in the specified output directory.

    4. On the Ambari dashboard, select Atlas > Configs > Advanced > Custom application-properties. Click Add Property, then add an atlas.migration.data.filename property and set its value to the full path of the atlas-migration-data.json file in the output directory you specified when you exported the HDP-2.x data.
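
Before entering the property value, it can help to confirm that the export actually produced the data file. The following Python sketch is not part of the Atlas tooling. It builds the full path that the atlas.migration.data.filename property expects and checks that the file exists and is not empty. The helper name and the /tmp/atlas-export directory are hypothetical.

```python
import os

# File name written by the exporter, as named in the step above.
MIGRATION_DATA_FILE = "atlas-migration-data.json"


def migration_data_path(output_dir):
    """Return the absolute path of the exported data file, or raise if unusable."""
    path = os.path.join(os.path.abspath(output_dir), MIGRATION_DATA_FILE)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"export file not found: {path}")
    if os.path.getsize(path) == 0:
        raise ValueError(f"export file is empty: {path}")
    return path


# Hypothetical output directory; use the one passed to the exporter with -d.
try:
    print(migration_data_path("/tmp/atlas-export"))
except (FileNotFoundError, ValueError) as err:
    print(f"not ready for import: {err}")
```
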
  3. Upgrade HDP and Atlas.
  4. The upgrade starts Atlas automatically, which initiates the migration of the exported HDP-2.x Atlas metadata into HDP-3.x. During the migration import process, Atlas blocks all REST API calls and Atlas hook notification processing.
    You can use the following Atlas API URL to display the migration status:
    http://<atlas_server>:21000/api/atlas/admin/status

    The migration status is displayed in the browser window:

    {"Status":"Migration","currentIndex":139,"percent":67,"startTimeUTC":"2018-04-06T00:54:53.399Z"}
  5. When the migration is complete, select Atlas > Configs > Advanced > Custom application-properties, then click the red Remove button to remove the atlas.migration.data.filename property.
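
The status response shown in step 4 carries enough information for a rough estimate of the remaining import time. The following Python sketch parses a response of that shape and extrapolates from the percent and startTimeUTC fields. It assumes that percent is a 0 to 100 value and that progress is roughly linear, neither of which the documentation guarantees. The function name is hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def estimate_remaining(status_json, now=None):
    """Return a timedelta estimate of the remaining import time, or None.

    Assumes the response has the shape shown in step 4 and that progress
    is roughly linear (an assumption, not documented behavior).
    """
    status = json.loads(status_json)
    if status.get("Status") != "Migration":
        return None  # no migration in progress
    percent = status.get("percent", 0)
    if percent <= 0:
        return None  # nothing to extrapolate from yet
    start = datetime.strptime(
        status["startTimeUTC"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    elapsed = now - start
    return elapsed * (100 - percent) / percent


# Sample response from step 4, evaluated 67 minutes after the start time.
sample = ('{"Status":"Migration","currentIndex":139,"percent":67,'
          '"startTimeUTC":"2018-04-06T00:54:53.399Z"}')
checked_at = datetime(2018, 4, 6, 2, 1, 53, 399000, tzinfo=timezone.utc)
print(estimate_remaining(sample, now=checked_at))
# prints "0:33:00"
```
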