DLM Administration

Chapter 1. Hortonworks Data Lifecycle Manager Terminology

You should be familiar with the terminology used in Hortonworks DataPlane Service (DPS) and in the Data Lifecycle Manager (DLM) service that interfaces with the DPS infrastructure.

DPS Platform

DPS Platform is a UI service platform from which you can manage and monitor various services on multiple Hortonworks Data Platform (HDP) Hadoop clusters. You can install DPS Platform either on an HDP cluster or on a host remote from the cluster.

Hortonworks DataPlane Service (DPS)

The family of components that includes the DPS Platform service platform and all services that plug into it.

service

An autonomous component in the DPS environment. DPS Platform is a service, as is each component that is enabled and managed through DPS Platform, such as Data Lifecycle Manager (DLM). DPS Platform and each of its plugin services must be installed as Docker containers.

cluster

A typical HDP Hadoop cluster. See the Cluster Planning guide for details.

The cluster hosts the various Systems of Record (SoRs) for metadata (Apache Hive, Apache Atlas, Apache Ranger, HDFS, and so on) that DPS Platform and associated plugin services rely on. In an on-premises environment, a cluster often equates to a data center; however, a single data center can contain multiple HDP Hadoop clusters.

Data Lifecycle Manager (DLM) Service

DLM is a UI service that is enabled through DPS Platform. From the DLM UI you can create and manage replication and disaster recovery policies and jobs.

DLM Engine

Also referred to as the Beacon engine, this is the replication engine that is required for Data Lifecycle Manager. The DLM Engine must be installed as a management pack on each cluster that is to be used in data replication jobs. The engine maintains, in a configured database, information about clusters and policies that are involved in replication.
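For illustration only, the sketch below queries a DLM Engine instance over HTTP for the policies it maintains. The engine exposes a REST API (it is based on Apache Beacon), but the host name, port, endpoint path, and response shape used here are assumptions for the sake of the example, not values documented in this guide; verify them against your installation.

    import requests  # third-party HTTP client (pip install requests)

    # Hypothetical DLM Engine (Beacon) endpoint; the host, port, and API path
    # are illustrative assumptions -- verify them against your installation.
    BEACON_URL = "http://dlm-engine.example.com:25968/api/beacon"

    # Ask the engine for the replication policies it currently maintains.
    response = requests.get(BEACON_URL + "/policy/list")
    response.raise_for_status()

    # The JSON shape assumed here ("policy" array with "name"/"status" keys)
    # is likewise illustrative, not a documented response format.
    for policy in response.json().get("policy", []):
        print(policy.get("name"), policy.get("status"))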

data center

The facility that houses the compute, server, and storage systems and associated infrastructure, such as routers, switches, and so forth. Corporate data is stored, managed, and distributed from the data center. In an on-premises environment, a data center often consists of a single Hadoop cluster.

policy

A set of rules applied to a replication relationship. The rules include which clusters serve as source and destination, the type of data to replicate, the schedule for replicating data, and so on.
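As a concrete illustration of the rule set described above, the following sketch models a policy as a plain data structure. The field names and values are illustrative assumptions only; they are not the DLM Engine's actual policy schema.

    # Illustrative sketch of a replication policy; field names are assumptions,
    # not the DLM Engine's actual policy schema.
    replication_policy = {
        "name": "nightly-hive-replication",
        "type": "HIVE",                       # the type of data to replicate
        "sourceCluster": "dc1$marketing",     # cluster that serves as the source
        "targetCluster": "dc2$marketing-dr",  # cluster that serves as the destination
        "frequencyInSec": 86400,              # the schedule: run one job per day
    }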

job

An instance of a policy that is running or has run.