The profiler engine runs data profiling operations as a pipeline on data located in multiple data lakes. These profilers create metadata annotations that summarize the content and shape characteristics of the data assets.
|Profiler|Identifier|Description|
|---|---|---|
|Hive Column||A Hive column univariate statistical profiler.|
|Hive Metastore|hive_metastore_profiler|Retrieves the number of Hive tables added each day.|
|Sensitive|sensitiveinfo|A sensitive data profiler (PII, PCI, HIPAA, etc.).|
|Ranger Audit|audit|A Ranger audit log summarizer.|
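To make "univariate statistical profiler" concrete, the following is a minimal Python sketch of the kind of per-column summary such a profiler might produce. It is an illustration only and not the profiler engine's actual implementation: the function name `profile_column`, the choice of statistics, and the sample data are all assumptions.

```python
from statistics import mean


def profile_column(values):
    """Summarize a single column (hypothetical helper, not the product API)."""
    non_null = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
    }
    # Numeric statistics only apply when every non-null value is a number.
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if non_null and len(numeric) == len(non_null):
        summary.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
    return summary


# Assumed sample column with one missing value.
ages = [34, 29, None, 41, 29]
print(profile_column(ages))
```

Each statistic depends on one column at a time, which is what "univariate" means here. A real profiler would compute these summaries over Hive table columns and store them as metadata annotations.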
You can edit some profiler configurations in Ambari through the Datalake Profiler component. Currently, only pre-built profilers are available, and you can schedule them only during installation.