Install the Data Plane Profiler Agent
DSS requires that the DP Profiler Agent be installed on all clusters. The Profiler is installed on the Ambari host, using an Ambari management pack (MPack). An MPack bundles service definitions, stack definitions, and stack add-on service definitions.
Prior to starting installation, you must have downloaded the required repository tarballs from the Hortonworks customer portal, following the instructions provided as part of the product procurement process. The repository tarballs for the Data Plane Profiler agent are different from the DSS app repository tarballs.
- Log in as root to an Ambari host on a cluster.
- Install the Data Plane Profiler MPack by running the following command, replacing <mpack-file-name> with the name of the MPack:
ambari-server install-mpack --mpack <mpack-file-name> --verbose
- Restart the Ambari server.
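The install and restart steps above can be scripted together. The following is a minimal sketch, not an official procedure: the function names `run` and `install_profiler_mpack` and the `DRY_RUN` variable are hypothetical, while `ambari-server install-mpack` and `ambari-server restart` are the standard Ambari CLI commands.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: install the MPack given as $1, then restart Ambari.
# The restart is attempted only if the install command succeeds.
install_profiler_mpack() {
    mpack_file="$1"
    if [ -z "$mpack_file" ]; then
        echo "usage: install_profiler_mpack <mpack-file-name>" >&2
        return 1
    fi
    run ambari-server install-mpack --mpack "$mpack_file" --verbose &&
        run ambari-server restart
}
```

Setting `DRY_RUN=1` before calling `install_profiler_mpack <mpack-file-name>` prints the two commands without running them, which can be useful for reviewing the command line before running it as root.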
- Launch Ambari in a browser and log in.
http://<ambari-server-host>:8080
Default credentials are:
- Username: admin
- Password: admin
- Click Admin > Manage Ambari.
- Click Versions, and then do the following on the Versions page:
URLs shown are for example purposes only. Actual URLs might be different.
- Click the HDP version in the Name column.
- Change the Base URL path for the DSS service to point to the local repository.
- Click the Ambari logo to return to the main Ambari page.
- In the Ambari Services navigation pane, click Actions > Add Service.
The Add Service Wizard displays.
- On the Choose Services page of the wizard, select the Dataplane Profiler service to install in Ambari, and then follow the on-screen instructions.
Other required services are automatically selected.
- When prompted to confirm the addition of dependent services, confirm all of them.
This adds other required services.
- On the Assign Masters page, you can choose the default settings.
- On the Customize Services page, fill out the database details and other required fields that are highlighted.
Make sure to enter the credentials that you set while configuring the external database. Change the username profileragent to the value set in the external database.
Note: Make sure to add the database driver to the machine, based on the external database that you configured.
- Complete the remaining installation wizard steps and exit the wizard.
- Ensure that all components required for your DPS service have started successfully.
- Enable Knox SSO for DP Profiler Agent.
- Set dpprofiler.sso.knox.enabled to true in the Advanced dpprofiler-env section of the DP Profiler Configs in Ambari.
- Run the following CLI command to export the Knox certificate:
$JAVA_HOME/bin/keytool -export -alias gateway-identity -rfc -file knox-pub-key.cert -keystore /usr/hdp/current/knox-server/data/security/keystores/gateway.jks
When prompted, enter the Knox master password.
- After generating the certificate, paste the contents of the certificate in the dpprofiler.sso.knox.public.key field under the Advanced dpprofiler-env properties of the DP Profiler Configs in Ambari.
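The text above does not say whether the pasted value should include the PEM marker lines. If the field turns out to expect only the Base64 body, the following sketch prints the exported file without its BEGIN and END lines. The function name `pem_body` is hypothetical; the file name `knox-pub-key.cert` is the one used in the export command.

```shell
#!/bin/sh
# Hypothetical helper: print a PEM certificate file without its
# BEGIN/END marker lines. Carriage returns are removed first, in case
# the file was written with CRLF line endings.
pem_body() {
    tr -d '\r' < "$1" |
        sed -e '/-----BEGIN CERTIFICATE-----/d' \
            -e '/-----END CERTIFICATE-----/d'
}
```

For example, `pem_body knox-pub-key.cert` writes the certificate body to standard output, ready to copy.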
- Open the quick link of the profiler for service verification.
Append /profilers to the quick link URL. If the quick link is xyz:21900, change it to xyz:21900/profilers.
Note: For non-Kerberized clusters, this request returns the list of all registered profilers. For Kerberos-enabled clusters where Knox is not enabled for DP Profiler Agent, you will see an HTTP 401 response, which is expected.
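The same verification can be done from a shell instead of a browser. This is a minimal sketch: the function names `profilers_url` and `profilers_status` are hypothetical, and the quick link value is supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helper: append /profilers to a quick link URL.
# A single trailing slash on the quick link is tolerated.
profilers_url() {
    base="${1%/}"
    echo "${base}/profilers"
}

# Hypothetical helper: request the profilers endpoint with curl and
# print only the HTTP status code of the response.
profilers_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(profilers_url "$1")"
}
```

With the quick link from the example, `profilers_url xyz:21900` prints `xyz:21900/profilers`, and `profilers_status xyz:21900` prints the status code. On a Kerberos-enabled cluster without Knox enabled for the DP Profiler Agent, a 401 is the expected result, as noted above.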
After you install the profiler agent using the Add Service Wizard in Ambari, the NodeManager hosts do not yet have the dpprofiler user.
For Ambari to create this user automatically, restart all NodeManagers by going to Services > YARN > Restart NodeManagers. NodeManagers can be restarted in a rolling fashion; the Ambari UI shows restart batching options.
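To confirm that the restart created the user, a check like the following can be run on each NodeManager host. This is a sketch only; the function names `user_exists` and `check_dpprofiler_user` are hypothetical, and `id -u` is the standard POSIX way to look up a user.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the named user exists on this host,
# non-zero otherwise.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Hypothetical helper: report whether the dpprofiler user is present.
check_dpprofiler_user() {
    if user_exists dpprofiler; then
        echo "dpprofiler user present"
    else
        echo "dpprofiler user missing; restart the NodeManager on this host"
    fi
}
```

Running `check_dpprofiler_user` on a host prints one line describing whether the user was found.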
Note: During DP Profiler Agent installation, two new Atlas types are registered, one of which is dss_hive_table_profile_data. These types contain attributes to store metrics computed by DSS profilers. In addition, existing Atlas types, including hive_column, are updated with an additional attribute, profileData, which is a reference to one of the new types (for example, to dss_hive_table_profile_data).
- If TDE zones are set up in the cluster and any of the following locations fall within a TDE zone, the dpprofiler user must have Decrypt_EEK access to the key or keys used to encrypt that zone:
- all locations of Hive tables
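To find out which key protects a given location, the zone list printed by `hdfs crypto -listZones` can be matched against the path. The sketch below assumes that each line of that output is a zone path followed by a key name; the function name `zone_key_for_path` and the example paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "zone-path key-name" lines on stdin and
# print the key name of the first zone that contains the path in $1.
# Returns 1 if the path is not inside any listed zone.
zone_key_for_path() {
    target="$1"
    while read -r zone key; do
        # Skip blank lines.
        [ -n "$zone" ] || continue
        case "$target" in
            "$zone"|"${zone%/}"/*)
                echo "$key"
                return 0
                ;;
        esac
    done
    return 1
}
```

A possible use is `hdfs crypto -listZones | zone_key_for_path /apps/hive/warehouse/mydb.db/mytable`, run as an HDFS superuser. If a key name is printed, the dpprofiler user needs Decrypt_EEK access to that key.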