Support Matrix for DLM Installation
Before installing Data Lifecycle Manager (DLM), review the requirements for your HDP environment and prepare your clusters accordingly. DLM is installed on the same host as DPS Platform.
DLM host requirements
The DLM application is installed on the same host as DPS Platform and has no requirements beyond what is required by DPS Platform. See the DPS Support Matrix for details.
Requirements for clusters used with DLM Engine
The clusters on which you install the DLM Engine must meet the requirements identified in the following sections. After the DLM Engine is installed and properly configured on a cluster, the cluster can be registered with DPS and used for DLM replication.
Clusters used as source and destination in a DLM replication relationship must have exactly the same configurations for LDAP, Kerberos, Ranger, Knox, and HA.
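The parity requirement above can be checked mechanically once the relevant settings of each cluster are collected. The following is a minimal sketch, assuming each cluster's settings are available as a plain dictionary keyed by area name. That dictionary layout and the function name `config_mismatches` are hypothetical and not part of DLM or DPS.

```python
# Areas that must be configured identically on source and destination,
# taken from the requirement stated above.
PARITY_KEYS = ("LDAP", "Kerberos", "Ranger", "Knox", "HA")


def config_mismatches(source, destination, keys=PARITY_KEYS):
    """Return {area: (source_value, destination_value)} for areas that differ.

    `source` and `destination` are hypothetical dictionaries describing
    each cluster; an area missing from a dictionary is treated as None.
    """
    return {
        key: (source.get(key), destination.get(key))
        for key in keys
        if source.get(key) != destination.get(key)
    }
```

An empty result means the two descriptions agree on all five areas. Any entry in the result names an area to reconcile before pairing the clusters.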
|JDK||JDK 8 (OpenJDK and Oracle JDK)|
Port and network requirements for clusters
Have the following ports available and open on each cluster:
|Default Port Number||Purpose||Comments||Required to be open?|
|25968||Port for DLM Engine (Beacon) service on hosts||Accessibility is required from all clusters. “Beacon” is the internal name for the DLM Engine. You will see the name Beacon in some paths, commands, etc.|
|50010||All DataNode hosts|| ||Yes|
|8080||Ambari server host|| ||Yes|
|10000||HiveServer2 host||Binary mode port (Thrift)||Yes|
|10001||HiveServer2 host||HTTP mode port||Yes|
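One way to confirm that the ports in the table above are reachable is a plain TCP connection test from a host in the peer cluster. The sketch below uses only the Python standard library. The port numbers are the defaults from the table, while the function names and the host names in the usage note are placeholders, not values from DLM. A successful connection shows only that the port accepts TCP connections, not that the expected service is healthy.

```python
import socket

# Default ports from the table above, with a short label for reporting.
DLM_DEFAULT_PORTS = {
    25968: "DLM Engine (Beacon)",
    50010: "DataNode",
    8080: "Ambari server",
    10000: "HiveServer2, binary mode (Thrift)",
    10001: "HiveServer2, HTTP mode",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_targets(targets, timeout=3.0):
    """Check an iterable of (host, port) pairs.

    Returns a dict mapping each (host, port) pair to True or False.
    """
    return {
        (host, port): port_is_open(host, port, timeout)
        for host, port in targets
    }
```

For example, `check_targets([("ambari.example.com", 8080), ("hs2.example.com", 10000)])` returns a result for each pair. The host names here are hypothetical and should be replaced with the hosts that run each service in your cluster.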
HDP component requirements for DLM
The following additional Apache components might be required on your clusters for DLM support, depending on the security configuration and type of replication being performed:
|Component||Purpose||Comments|
|HDFS||For replicating HDFS data.|
|Knox||Authentication federation from DPS||Knox must be enabled on clusters before you can register the clusters with DPS.|
|Ranger||Authorization on clusters during replication||Ranger is optional for HDFS replication, but required for Hive replication.|
|Hive||For replicating Hive database content||Updates made through Hive 1 (based on Apache Hive 1.2.x) and HiveServerInteractive (based on Apache Hive 2.1.x) are replicated. However, HiveServer2 from Hive 1 is always used to run the replication tasks.|
|HiveServer2||Needed for Hive replication|
|Hive Metastore||Needed for Hive replication|
|ZooKeeper||Needed for Hive|
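The component table above can be read as a small lookup from replication type to required components. The sketch below encodes one such reading and is an assumption, not an official rule set. It treats Knox as always required because clusters must be registered with DPS, and it lists for Hive replication only the components the table names explicitly. The function name `missing_components` is hypothetical.

```python
# Required for every cluster: Knox must be enabled before registration with DPS.
ALWAYS_REQUIRED = {"Knox"}

# Components required per replication type, as read from the table above.
# Ranger is optional for HDFS replication, so it appears only under "hive".
REQUIRED_BY_TYPE = {
    "hdfs": {"HDFS"},
    "hive": {"Hive", "HiveServer2", "Hive Metastore", "ZooKeeper", "Ranger"},
}


def missing_components(replication_type, installed):
    """Return a sorted list of required components not found in `installed`.

    `replication_type` is "hdfs" or "hive"; `installed` is any iterable
    of component names spelled as in the table.
    """
    required = ALWAYS_REQUIRED | REQUIRED_BY_TYPE[replication_type]
    return sorted(required - set(installed))
```

For a cluster that has only HDFS installed, `missing_components("hdfs", ["HDFS"])` reports Knox as the one missing component.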