DLM Installation and Upgrade

DLM support requirements

Before installing Data Lifecycle Manager (DLM), review your HDP environment and prepare your clusters. DLM is installed on the same host as DataPlane Platform.

Support Matrix information

You can find the most current information about interoperability for this release on the Support Matrix. The Support Matrix tool provides information about:

  • Operating Systems

  • Databases

  • Browsers

  • JDKs

To access the tool, go to: https://supportmatrix.hortonworks.com.

DLM Host requirements

The DLM application is installed on the same host as DP Platform and has no requirements beyond what is required by DP Platform. See the DP Platform Support Requirements for details.

Requirements for clusters used with DLM Engine

  • The clusters on which you install the DLM Engine must meet the requirements identified in the following sections. After the DLM Engine is installed and properly configured on a cluster, the cluster can be registered with DPS and used for DLM replication.
Important

Clusters used as source and destination in a DLM replication relationship must have exactly the same configurations for LDAP, Kerberos, Ranger, Knox, and HA.

  • DLM Engine can be installed on any server-class machine with at least 1 GB of memory. You might need to increase the memory to about 2 GB or 3 GB, depending on the HDFS replication dataset, for example the number of files in the dataset and the number of concurrent replication policies.

See the Support Matrix for supported operating systems and databases.

Port and network requirements for clusters

Have the following ports available and open on each cluster:

Default Port Number | Purpose | Comments | Required to be open?
25968 | Port for DLM Engine (Beacon) service on hosts | Accessibility is required from all clusters. “Beacon” is the internal name for the DLM Engine. You will see the name Beacon in some paths, commands, etc. | Yes
8020 | NameNode host | | Yes
50010 | All DataNode hosts | | Yes
8080 | Ambari server host | | Yes
10000 | HiveServer2 host | Binary mode port (Thrift) | Yes
10001 | HiveServer2 host | HTTP mode port | Yes
9083 | Hive metastore | | Yes
2181 | ZooKeeper hosts | | Yes
6080 | Ranger port | | Yes
8050 | YARN port | | Yes
21000 | Atlas port | | Yes
6182 | Secured Ranger | | Yes
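
The port list above can be checked from a peer cluster before you install the DLM Engine. The following is a minimal sketch in Python, not part of DLM. It assumes the default port numbers from the table and that a plain TCP connection is a good enough reachability test. The names DLM_PORTS, is_port_open, and check_cluster are hypothetical helpers introduced only for illustration.

```python
import socket

# Default ports transcribed from the table above.
# The service labels are illustrative keys, not DLM identifiers.
DLM_PORTS = {
    "DLM Engine (Beacon)": 25968,
    "NameNode": 8020,
    "DataNode": 50010,
    "Ambari server": 8080,
    "HiveServer2 (binary/Thrift)": 10000,
    "HiveServer2 (HTTP)": 10001,
    "Hive metastore": 9083,
    "ZooKeeper": 2181,
    "Ranger": 6080,
    "YARN": 8050,
    "Atlas": 21000,
    "Secured Ranger": 6182,
}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_cluster(hosts_by_service, timeout=3.0):
    """Check each service's default port on the host supplied for it.

    hosts_by_service maps a key of DLM_PORTS to a hostname
    (assumption: one host per service; call again for additional hosts).
    Returns a dict of {service: True or False}.
    """
    return {
        service: is_port_open(host, DLM_PORTS[service], timeout)
        for service, host in hosts_by_service.items()
    }
```

For example, check_cluster({"NameNode": "nn1.example.com", "Ambari server": "ambari.example.com"}) returns a dictionary of booleans, where the hostnames are placeholders. A successful connection only shows that the port is reachable from the machine running the script, so run it from a host in each cluster that participates in replication.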

HDP component requirements for DLM

The following additional Apache components might be required on your clusters for DLM support, depending on the security configuration and type of replication being performed:

Component | Purpose | Comments
HDFS | For replicating HDFS data. |
Knox | Authentication federation from DPS | Knox must be enabled on clusters before you can register the clusters with DPS.
Ranger | Authorization on clusters during replication | Ranger is optional for HDFS replication, but required for Hive replication.
YARN | |
Hive | For replicating Hive database content | Updates via Hive 1 (based on Apache Hive 1.2.x) and HiveServerInteractive (based on Apache Hive 2.1.x) are replicated. However, HiveServer2 from Hive 1 is always used for running the replication tasks.
HiveServer2 | Needed for Hive replication |
Hive Metastore | Needed for Hive replication |
ZooKeeper | Needed for Hive |
Atlas | Needed for metadata replication |
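
As a planning aid, the component table above can be turned into a small lookup that reports which components are still missing for a given replication type. This is a sketch under stated assumptions, not a DLM tool. The grouping into "common", "hdfs", and "hive" is an interpretation of the Purpose and Comments columns, and YARN and Atlas are left out because the table does not tie them to a specific replication type. The function name missing_components is hypothetical.

```python
# Components needed regardless of replication type (assumption: Knox,
# because clusters cannot be registered with DPS without it).
COMMON = {"Knox"}

# Components per replication type, as read from the table above.
# Ranger is optional for HDFS replication but required for Hive replication.
BY_TYPE = {
    "hdfs": {"HDFS"},
    "hive": {"Hive", "HiveServer2", "Hive Metastore", "ZooKeeper", "Ranger"},
}


def missing_components(replication_type, installed):
    """Return a sorted list of components not yet present on the cluster.

    replication_type: "hdfs" or "hive".
    installed: iterable of component names already on the cluster.
    """
    if replication_type not in BY_TYPE:
        raise ValueError(f"unknown replication type: {replication_type!r}")
    required = COMMON | BY_TYPE[replication_type]
    return sorted(required - set(installed))
```

Calling missing_components("hive", {"Knox", "Hive", "HiveServer2"}) lists the Hive-related components that still need to be installed. Check the result against the Support Matrix and your security configuration before relying on it.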