Pre-installation tasks for Data Plane Profiler Agent
Perform these tasks before you install the Data Plane Profiler Agent on the cluster.
- Ensure that the clusters are running the latest version of HDP.
- Ensure that the following HDP components are installed and configured:
  - Spark2 and Livy Server2
- If you plan to sync users from LDAP into Ranger, ensure a dpprofiler user is created in LDAP and synced into Ranger.
- Make sure that HDFS Audit logging for Ranger is enabled.
Add the following proxy user details to the custom core-site.xml file:
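Proxy user entries in core-site.xml follow Hadoop's standard hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups pattern. A minimal sketch of that pattern is shown here, assuming a placeholder user name; the angle-bracket values are illustrative and must be replaced with the service users, hosts, and groups your deployment requires:

```
# <user> is a placeholder for the service account that impersonates other users
# hosts: the hosts from which <user> is allowed to submit impersonated requests
hadoop.proxyuser.<user>.hosts=<comma-separated host list>
# groups: the groups whose members <user> is allowed to impersonate
hadoop.proxyuser.<user>.groups=<comma-separated group list>
```

In Ambari, each key and value pair is typically entered as a separate property under Custom core-site.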
If the cluster is Kerberos-enabled, go to the Kerberos configuration section in Ambari and look up the value of the global property named principal suffix. Then go to the Spark2 service, open the Custom livy2-conf section, and add this property.
Ensure that the following configuration is set up in Spark2 so that Spark history files are cleaned up regularly and do not fill HDFS over time:
- Log in to Ambari on the cluster.
- Select Spark2 > Configs > Custom spark2-defaults.
Add the following lines:
spark.history.fs.cleaner.maxAge=7d
This ensures that Spark history from jobs older than seven days is cleaned up once per day. Modify the values as needed.
- Restart the services as required.
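As a sketch of what a complete Custom spark2-defaults entry for this behavior might look like: only spark.history.fs.cleaner.maxAge comes from the steps above. The enabled and interval properties are standard Spark history server settings, included here on the assumption that the cleaner must be switched on and should run daily to match the "once per day" behavior described:

```
# Assumption: turn on the history server's periodic cleaner (Spark's default is false)
spark.history.fs.cleaner.enabled=true
# Assumption: how often the cleaner checks for files to delete (once per day)
spark.history.fs.cleaner.interval=1d
# From the steps above: delete history files older than seven days
spark.history.fs.cleaner.maxAge=7d
```

A shorter maxAge frees HDFS space sooner but leaves less job history available in the Spark history server.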