DSS Administration

Managing Profiles

The profiler engine runs data profiling operations as a pipeline on data located in multiple data lakes. These profiles create metadata annotations that summarize the content and shape characteristics of data assets.

Table 1. List of built-in profilers
Name            Profiler                  Description
Hive Column     tablestats, hivecolumn    A Hive column univariate statistical profiler.
Hive Metastore  hive_metastore_profiler   Retrieves information about the number of Hive tables that have been added every day.
Sensitive       sensitiveinfo             A sensitive data profiler (PII, PCI, HIPAA, etc.).
Ranger Audit    audit                     A Ranger audit log summarizer.
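
To make "univariate statistical profiler" concrete, here is a minimal Python sketch of the kind of per-column summary such a profiler might produce. It is an illustration only, not the DSS implementation: the function name profile_column, the sample data, and the choice of statistics are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, pstdev


def profile_column(values):
    """Summarize a single column: row count, nulls, distinct values, numeric stats.

    Hypothetical helper for illustration; not part of DSS or Hive.
    """
    non_null = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "top_values": Counter(non_null).most_common(3),
    }
    # Numeric statistics are added only when every non-null value is a number.
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(non_null):
        summary.update(
            min=min(numeric),
            max=max(numeric),
            mean=mean(numeric),
            stddev=pstdev(numeric),
        )
    return summary


# Example column with one missing value (made-up data).
ages = [34, 41, None, 29, 41]
profile = profile_column(ages)
```

Each column is summarized independently of the others, which is what "univariate" means here. A real profiler would typically compute these statistics with queries against the table rather than in memory.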

You can edit some of the profiler configurations in Ambari through the Datalake Profiler component. Currently, only the pre-built profilers are available, and you can schedule profilers only during installation.