DSS Administration

Step 1 Create Tag Schema files

Tags are assigned to columns when the tag rules match the column name or the majority of values in the column. Tags also carry metadata, such as the percentage of values in a column that must match for the column to receive the tag.

Define tags in files with the .kraptr_tag.json extension.

You must define the tags in one or more tag schema files in the following path:

/apps/dpprofiler/profilers/sensitive_info_profiler/${profiler_version}/lib/kraptr/tags/

Here is an example of a schema file. This schema defines two tags: luhns_demo and column_name_only_demo.

{
 "groupName": "demo",
 "tags": [
   {
     "tag": "luhns_demo",
     "nameWeight": 20,
     "valueWeight": 80,
     "isEnabled": true
   },
   {
     "tag": "column_name_only_demo",
     "nameWeight": 82,
     "valueWeight": 18,
     "isEnabled": true
   }
 ]
}

It is highly recommended that each schema file use its own unique groupName. Unique group names help you organize different sets of tags into separate modules.
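Because tags can be spread across several schema files, it can help to check the files before deploying them. The following Python sketch is not part of DSS. It assumes only the file layout shown in the schema example above, and the function name check_tag_schemas is hypothetical. It reports files that are not valid JSON (for example, a missing comma between two tag entries), tag entries that lack one of the four fields, and groupName or tag values that are reused.

```python
import json
from collections import defaultdict
from pathlib import Path

# Fields that every tag entry carries in the schema example (assumed to be required).
REQUIRED_TAG_KEYS = {"tag", "nameWeight", "valueWeight", "isEnabled"}


def check_tag_schemas(tags_dir):
    """Return a list of problems found in the *.kraptr_tag.json files of tags_dir."""
    problems = []
    groups = defaultdict(list)  # groupName -> files that use it
    labels = defaultdict(list)  # tag label -> files that define it

    for path in sorted(Path(tags_dir).glob("*.kraptr_tag.json")):
        try:
            schema = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as err:
            problems.append(f"{path.name}: invalid JSON ({err.msg}, line {err.lineno})")
            continue
        if not isinstance(schema, dict):
            problems.append(f"{path.name}: top level is not a JSON object")
            continue

        group = schema.get("groupName")
        if group is None:
            problems.append(f"{path.name}: missing groupName")
        else:
            groups[group].append(path.name)

        for entry in schema.get("tags", []):
            if not isinstance(entry, dict):
                problems.append(f"{path.name}: tag entry is not a JSON object")
                continue
            missing = REQUIRED_TAG_KEYS - set(entry)
            if missing:
                problems.append(f"{path.name}: tag entry missing {sorted(missing)}")
            if "tag" in entry:
                labels[entry["tag"]].append(path.name)

    # Report names that appear more than once across all files.
    for name, files in groups.items():
        if len(files) > 1:
            problems.append(f"groupName {name!r} is used in {files}")
    for label, files in labels.items():
        if len(files) > 1:
            problems.append(f"tag {label!r} is defined more than once: {files}")
    return problems
```

To use it, pass the tags directory for your profiler version to check_tag_schemas and print each returned line. An empty list means that none of these particular problems were found. It does not guarantee that the profiler accepts the files.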

Each tag entry has the following fields:

  1. tag - The label that appears in DSS for this column. This label should be unique.
  2. nameWeight - Percentage weight for the column name. For columns such as age, where identifying the label from values alone is extremely difficult, you can define a DSL that identifies column names close to age and its synonyms, and keep nameWeight high so that the column is tagged whenever the name matches.
  3. valueWeight - Percentage weight for the column values. The match percentage among all rows in the table sample is considered. The actual percentage attributed to the column for a tag is computed as follows:
    1. actualValueWeight = valueWeight * number_of_rows_matched / total_number_of_rows_in_sample
  4. isEnabled - If this is set to false, the tag is not identified.
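As a worked illustration of the actualValueWeight formula, the short Python sketch below applies it to the luhns_demo tag (valueWeight 80). The function name and the sample counts (45 matching rows out of 60 sampled) are made up for the example and do not come from DSS.

```python
def actual_value_weight(value_weight, rows_matched, rows_in_sample):
    """Scale valueWeight by the fraction of sampled rows that matched the tag rule."""
    if rows_in_sample <= 0:
        raise ValueError("rows_in_sample must be positive")
    if not 0 <= rows_matched <= rows_in_sample:
        raise ValueError("rows_matched must be between 0 and rows_in_sample")
    return value_weight * rows_matched / rows_in_sample


# Hypothetical sample: 45 of 60 sampled rows match the luhns_demo rule.
print(actual_value_weight(80, 45, 60))  # 60.0
```

With every sampled row matching, the result equals valueWeight itself. With no matches it is 0, so only the column name can contribute to the tag.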