Using TDE with DLM
DLM supports Transparent Data Encryption (TDE) for protecting data at rest. You can use TDE to prevent unauthorized access to your data. The source root directory must be either entirely encrypted or entirely unencrypted. DLM Engine does not support replication when part of the data is unencrypted and part is encrypted, whether with one key or several.
Replication scenarios for TDE-enabled data
DLM supports replication of HDFS and Hive data when:
- Both source and destination are encrypted with the same key (on-premises to on-premises replication only)
- Both source and destination are encrypted with different keys
- Source is unencrypted, but destination is encrypted
Note that DLM does not allow replication when the source is encrypted, but the destination is unencrypted.
- Same key: If the source and destination are encrypted with the same key, DLM Engine optimizes replication by copying the encrypted blocks as they are, without decrypting and re-encrypting the data.
- Different keys: During replication, source data is decrypted using the source key and then encrypted using the destination key.
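The rules above can be summarized as a small decision function. The following Python sketch is illustrative only and is not part of DLM: the `Endpoint` class, the `replication_mode` function, and the returned labels are hypothetical names chosen for this example, and a key name of `None` is assumed to mean "unencrypted".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of one side of a replication policy."""
    key_name: Optional[str]   # None is assumed to mean "unencrypted"
    on_premises: bool = True


def replication_mode(source: Endpoint, destination: Endpoint) -> str:
    """Classify a source/destination pair using the rules listed above."""
    if source.key_name is None and destination.key_name is None:
        # Not covered by the TDE rules; assumed to be ordinary replication.
        return "no-tde"
    if destination.key_name is None:
        # Encrypted source, unencrypted destination: not allowed.
        raise ValueError("encrypted source cannot replicate to an unencrypted destination")
    if source.key_name is None:
        # Unencrypted source, encrypted destination.
        return "encrypt-at-destination"
    if source.key_name == destination.key_name:
        if source.on_premises and destination.on_premises:
            # Encrypted blocks are copied without decrypt/re-encrypt.
            return "same-key"
        raise ValueError("same-key replication is on-premises to on-premises only")
    # Decrypt with the source key, re-encrypt with the destination key.
    return "different-key"


if __name__ == "__main__":
    print(replication_mode(Endpoint("key_a"), Endpoint("key_a")))
    print(replication_mode(Endpoint("key_a"), Endpoint("key_b")))
    print(replication_mode(Endpoint(None), Endpoint("key_b")))
```

The function raises an error for the unsupported combinations rather than returning a label, so a caller cannot silently proceed with a policy that DLM would reject.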
TDE in HDFS
- TDE should be configured in the HDFS service, and the directories must be marked as encryption zones using the encryption keys.
Refer to Data Protection: HDFS Encryption in the HDP Security guide for more information.
- You can set TDE per directory or per cluster on HDFS.
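As a rough illustration of marking a directory as an encryption zone, the sketch below assembles the standard Hadoop commands (`hadoop key create`, `hdfs dfs -mkdir`, `hdfs crypto -createZone`, `hdfs crypto -listZones`) as argument lists without running them. The helper name `encryption_zone_commands`, the key name `dlm_key`, and the path `/data/secure` are hypothetical; the sketch also assumes a key provider (such as Ranger KMS) is already configured and that the target directory is empty when the zone is created.

```python
import shlex
from typing import List


def encryption_zone_commands(key_name: str, zone_path: str) -> List[List[str]]:
    """Build the commands that create a key and an encryption zone.

    The commands are returned, not executed, so they can be reviewed first.
    """
    return [
        # Create the encryption key in the configured key provider.
        ["hadoop", "key", "create", key_name],
        # Create the (empty) directory that will become the zone root.
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],
        # Mark the directory as an encryption zone that uses the key.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
        # List zones to confirm the new zone is present.
        ["hdfs", "crypto", "-listZones"],
    ]


if __name__ == "__main__":
    # Hypothetical key name and path, for illustration only.
    for argv in encryption_zone_commands("dlm_key", "/data/secure"):
        print(shlex.join(argv))
```

These commands typically need to be run by a user with key-management and HDFS superuser privileges; consult the HDP Security guide for the exact procedure in your environment.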
TDE with Hive
- For Hive replication in DLM, any cluster that is using TDE and acts as a source for replication must have the entire data warehouse in a single encryption zone.
- For Hive replication, you can set TDE only at the cluster level.
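One way to sanity-check the single-zone requirement before creating a Hive policy is to compare the warehouse path with the cluster's encryption zone roots (for example, as reported by `hdfs crypto -listZones`). The sketch below is a simplified, hypothetical helper and not a DLM feature: `covering_zone` and the example paths are assumptions, and the check does not account for tables whose data is stored outside the warehouse directory.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def covering_zone(warehouse: str, zones: Iterable[str]) -> Optional[str]:
    """Return the zone root that contains the whole warehouse, or None.

    The warehouse counts as covered when it is the zone root itself or
    lies somewhere beneath it.
    """
    wh = PurePosixPath(warehouse)
    for zone in zones:
        root = PurePosixPath(zone)
        if wh == root or root in wh.parents:
            return str(root)
    return None


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    zones = ["/apps/hive", "/data/secure"]
    print(covering_zone("/apps/hive/warehouse", zones))
```

A result of `None` suggests that the warehouse is unencrypted or only partly inside a zone (for example, when only an individual database directory is a zone), which does not meet the requirement described above.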