Administering HDFS

DistCp data copy matrix

To copy data between HDP clusters that run different versions using DistCp, you must adjust the configuration of both the source and destination clusters.

The following table summarizes the source and destination configurations, the cluster on which to run DistCp, and the expected result when copying data between different versions of HDP clusters:

From       To         Source Configuration  Destination Configuration  Run DistCp On                           Result
---------  ---------  --------------------  -------------------------  --------------------------------------  ---------------
HDP 1.3    HDP 2.x    insecure + hdfs       insecure + webhdfs         HDP 1.3 (source)                        success
HDP 1.3    HDP 2.x    secure + hdfs         secure + webhdfs           HDP 1.3 (source)                        success
HDP 1.3    HDP 2.x    secure + hftp         secure + hdfs              HDP 2.x (destination)                   success
HDP 1.3    HDP 2.1    secure + hftp         secure + swebhdfs          HDP 2.1 (destination)                   success
HDP 1.3    HDP 2.x    secure + hdfs         insecure + webhdfs         HDP 1.3 (source)                        possible issues
HDP 2.x    HDP 2.x    secure + hdfs         insecure + hdfs            secure HDP 2.x (source)                 success
HDP 2.x    HDP 2.x    secure + hdfs         secure + hdfs              either HDP 2.x (source or destination)  success
HDP 2.x    HDP 2.x    secure + hdfs         secure + webhdfs           HDP 2.x (source)                        success
HDP 2.x    HDP 2.x    secure + hftp         secure + hdfs              HDP 2.x (destination)                   success
HDP 3.0.x  HDP 2.6.5  secure + hdfs         secure + hdfs              HDP 3.0.x (source)                      success
HDP 3.0.x  HDP 2.6.5  secure + webhdfs      secure + webhdfs           HDP 3.0.x (source)                      success

In this table:
  • The term "secure" means that Kerberos security is set up.

  • HDP 2.x means HDP 2.0 or later.
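
As an illustration only, the matrix can be encoded as a small lookup that reports where DistCp should be run and what result to expect, and then assembles the corresponding `hadoop distcp` invocation. This is a minimal sketch, not part of HDP: the names `Row`, `MATRIX`, `lookup`, and `distcp_command` are hypothetical, as are the NameNode host names and the path in the usage lines. The sketch matches version labels exactly as they appear in the table and leaves out ports, so each scheme's default port is assumed.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    """One row of the DistCp data copy matrix (hypothetical helper type)."""
    src_version: str
    dst_version: str
    src_secure: bool   # True means Kerberos security is set up
    src_scheme: str
    dst_secure: bool
    dst_scheme: str
    run_on: str        # cluster on which DistCp should be run
    result: str


# Rows transcribed from the table above, in the same order.
MATRIX = [
    Row("HDP 1.3", "HDP 2.x", False, "hdfs", False, "webhdfs", "source", "success"),
    Row("HDP 1.3", "HDP 2.x", True, "hdfs", True, "webhdfs", "source", "success"),
    Row("HDP 1.3", "HDP 2.x", True, "hftp", True, "hdfs", "destination", "success"),
    Row("HDP 1.3", "HDP 2.1", True, "hftp", True, "swebhdfs", "destination", "success"),
    Row("HDP 1.3", "HDP 2.x", True, "hdfs", False, "webhdfs", "source", "possible issues"),
    Row("HDP 2.x", "HDP 2.x", True, "hdfs", False, "hdfs", "source", "success"),
    Row("HDP 2.x", "HDP 2.x", True, "hdfs", True, "hdfs", "source or destination", "success"),
    Row("HDP 2.x", "HDP 2.x", True, "hdfs", True, "webhdfs", "source", "success"),
    Row("HDP 2.x", "HDP 2.x", True, "hftp", True, "hdfs", "destination", "success"),
    Row("HDP 3.0.x", "HDP 2.6.5", True, "hdfs", True, "hdfs", "source", "success"),
    Row("HDP 3.0.x", "HDP 2.6.5", True, "webhdfs", True, "webhdfs", "source", "success"),
]


def lookup(src_version: str, dst_version: str,
           src_secure: bool, src_scheme: str,
           dst_secure: bool, dst_scheme: str) -> Optional[Row]:
    """Return the matching matrix row, or None if the combination is not listed."""
    key = (src_version, dst_version, src_secure, src_scheme, dst_secure, dst_scheme)
    for row in MATRIX:
        if row[:6] == key:
            return row
    return None


def distcp_command(row: Row, src_host: str, dst_host: str, path: str) -> List[str]:
    """Build the argument list for a DistCp run using the row's URI schemes.

    The host names and path are supplied by the caller; no port is added,
    so the default port for each scheme is assumed.
    """
    return [
        "hadoop", "distcp",
        f"{row.src_scheme}://{src_host}{path}",
        f"{row.dst_scheme}://{dst_host}{path}",
    ]


# Usage with hypothetical NameNode hosts and path.
row = lookup("HDP 1.3", "HDP 2.x", True, "hftp", True, "hdfs")
if row is not None:
    print("Run DistCp on the", row.run_on, "cluster; expected result:", row.result)
    print(" ".join(distcp_command(row, "src-nn.example.com", "dst-nn.example.com", "/data")))
```

A combination that is not in the table returns `None`, which the sketch treats as "not covered by the matrix" rather than as a failure.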