Dynamic Tag-Based Column Masking in Hive with Ranger Policies
Where Ranger resource-based masking policy for Hive anonymizes data from a Hive column identified by the database, table, and column, tag-based masking policy anonymizes Hive column data based on tags and tag attribute values associated with Hive column (usually specified as metadata classification in Atlas).
The following conditions apply when using Ranger column masking policies to mask data returned in Hive query results:
A variety of masking types are available, such as show last 4 characters, show first 4 characters, Hash, Nullify, and date masks (show only year).
You can specify a masking type for specific users, groups, and conditions.
Wildcard matching is not supported.
If there are multiple tag masking policies applied to the same Hive column, the masking policy with the lexicographically smallest policy-name is chosen for enforcement, E.G., policy "a" is enforced before policy "aa".
Masks are evaluated in the order listed in the policy.
An audit log entry is generated each time a masking policy is applied to a column.
Select Access Manager > Tag Based Policies, then select a tag-based
Select the Masking tab, then click Add New
On the Create Policy page, add the following information for
the column-masking filter:
Table 1. Policy Details Field Description
Set to Hive by default.
Enter an appropriate policy name. This name cannot be duplicated across the system. The policy is enabled by default.
Enter the applicable tag name, E.G., MASK.
Audit Logging Audit Logging is set to Yes by default. Select No to turn off audit logging. Description Enter an optional description for the policy. Table 2. Mask Conditions
Specify the groups to which this policy applies.
The public group contains all users, so granting access to the public group grants access to all users.
Select User Specify one or more users to which this policy applies. Policy Conditions Click Add Conditions to add or edit policy conditions. Currently "Accessed after expiry_date? (yes/no)" is the only available policy condition. To set this condition, type yes in the text box, then select the green check mark button to add the condition. Access Types Currently hive and select are the only available access types. Select Masking OptionTo create a row filter for the specified users and groups, click Select Masking Option, then select a masking type:
- Redact – mask all alphabetic characters with "x" and all numeric characters with "n".
- Partial mask: show last 4 – Show only the last four characters.
- Partial mask: show first 4 – Show only the first four characters.
- Hash – Replace all characters with a hash of entire cell value.
- Nullify – Replace all characters with a NULL value.
- Unmasked (retain original value) – No masking is applied.
- Date: show only year – Show only the year portion of a date string and default the month and day to 01/01
- Custom – Specify a custom masked value or expression. Custom masking can use any valid Hive UDF (Hive that returns the same data type as the data type in the column being masked).
Masking conditions are evaluated in the order listed in the policy. The condition at the top of the Masking Conditions list is applied first, then the second, then the third, and so on.
- You can use the Plus (+) symbols to add additional conditions. Conditions are evaluated in the order listed in the policy. The condition at the top of the list is applied first, then the second, then the third, and so on.
- Click Add to add the new policy.