DSS Installation

Pre-installation tasks for Data Plane Profiler Agent

Perform these tasks before you install the Data Plane Profiler Agent on the cluster.

  1. Ensure that the clusters are running the latest version of HDP.
  2. Ensure that the following HDP components are installed and configured:
    • Atlas
    • Ranger
    • Knox
    • Spark2 and Livy Server2
  3. If you plan to sync users from LDAP into Ranger, ensure a dpprofiler user is created in LDAP and synced into Ranger.
  4. Make sure that HDFS Audit logging for Ranger is enabled.
  5. Add the following proxy user properties to the custom core-site.xml file:
    hadoop.proxyuser.livy.groups=*
    hadoop.proxyuser.livy.hosts=*
    hadoop.proxyuser.knox.groups=*
    hadoop.proxyuser.knox.hosts=*
  6. If the cluster is Kerberos-enabled, go to the Kerberos configuration section in Ambari and look up the value of the global property called principal suffix. Then go to the Spark2 service, open the Custom livy2-conf section, and add this property.
  7. Ensure that Spark2 is configured to clean up history files so that they do not fill up HDFS space over time:
    1. Log in to Ambari on the cluster.
    2. Select Spark2 > Configs > Custom spark2-defaults.
    3. Add the following lines:
      spark.history.fs.cleaner.enabled=true
      spark.history.fs.cleaner.interval=1d
      spark.history.fs.cleaner.maxAge=7d
      This ensures that Spark history from jobs older than seven days is cleaned up once per day. Modify the values as needed.
  8. Restart the services as required.
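
After the restart, it can be useful to confirm that the proxy user and history cleaner settings from the steps above actually reached the configuration files on a cluster node. The following is a minimal sketch, not part of DSS or Ambari: the helper names (`parse_hadoop_xml`, `parse_spark_defaults`, `find_problems`) are hypothetical, and it assumes that core-site.xml uses the standard Hadoop `<configuration>`/`<property>` layout and that the Spark defaults file holds one key and value per line, separated by whitespace or `=`.

```python
import re
import xml.etree.ElementTree as ET

# Expected values, copied from the pre-installation steps above.
EXPECTED_CORE_SITE = {
    "hadoop.proxyuser.livy.groups": "*",
    "hadoop.proxyuser.livy.hosts": "*",
    "hadoop.proxyuser.knox.groups": "*",
    "hadoop.proxyuser.knox.hosts": "*",
}
EXPECTED_SPARK_DEFAULTS = {
    "spark.history.fs.cleaner.enabled": "true",
    "spark.history.fs.cleaner.interval": "1d",
    "spark.history.fs.cleaner.maxAge": "7d",
}


def parse_hadoop_xml(text):
    """Return {name: value} from a Hadoop-style <configuration> document."""
    root = ET.fromstring(text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def parse_spark_defaults(text):
    """Return {key: value} from spark-defaults style text.

    Assumption: one setting per line, key and value separated by
    whitespace or '='; blank lines and '#' comments are skipped.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
        if len(parts) == 2:
            props[parts[0]] = parts[1].strip()
    return props


def find_problems(expected, actual):
    """Map each missing or mismatched key to the value found (None if absent)."""
    return {k: actual.get(k) for k, v in expected.items() if actual.get(k) != v}


if __name__ == "__main__":
    # Inline sample in place of a real file; knox.hosts is deliberately omitted.
    sample = """
    <configuration>
      <property><name>hadoop.proxyuser.livy.groups</name><value>*</value></property>
      <property><name>hadoop.proxyuser.livy.hosts</name><value>*</value></property>
      <property><name>hadoop.proxyuser.knox.groups</name><value>*</value></property>
    </configuration>
    """
    print(find_problems(EXPECTED_CORE_SITE, parse_hadoop_xml(sample)))
```

To check a real node, read the files and pass their text to the parsers. The locations `/etc/hadoop/conf/core-site.xml` and `/etc/spark2/conf/spark-defaults.conf` are assumptions about a typical HDP layout and may differ on your cluster. An empty result from `find_problems` means every expected setting is present with the expected value.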