1.8. Creating Auth-to-Local Rules

To accommodate more complex translations, you can create a hierarchical set of rules to add to the default. Each rule is divided into three parts: base, filter, and substitution.

  • The Base

    The base begins with the number of components in the principal name (excluding the realm), followed by a colon and the pattern for building the username from the sections of the principal name. In the pattern, $0 translates to the realm, $1 to the first component, and $2 to the second component.

    For example:

    [1:$1@$0] translates myusername@EXAMPLE.COM to myusername@EXAMPLE.COM
    [2:$1] translates myusername/admin@EXAMPLE.COM to myusername
    [2:$1%$2] translates myusername/admin@EXAMPLE.COM to myusername%admin

  • The Filter

    The filter consists of a regular expression (regex) enclosed in parentheses. It must match the generated string for the rule to apply.

    For example:

    (.*%admin) matches any string that ends in %admin
    (.*@SOME.DOMAIN) matches any string that ends in @SOME.DOMAIN

  • The Substitution

    The substitution is a sed-style rule that replaces the text matched by a regex with a fixed string.

    For example:

    s/@ACME\.COM// removes the first instance of @ACME.COM
    s/@[A-Z]*\.COM// removes the first instance of @ followed by an uppercase name ending in .COM
    s/X/Y/g replaces all instances of X in the name with Y

    The three parts combine into a single RULE line, as shown in the schematic after this list.

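Putting the three parts together, every rule has the same general shape. The schematic below summarizes the syntax described above; the trailing g flag is optional and makes the substitution global rather than first-match-only:

    RULE:[<n>:<pattern>](<filter regex>)s/<regex>/<replacement>/g

For instance, in the rule RULE:[2:$1%$2@$0](.*%admin@EXAMPLE.COM)s/.*/admin/, the base is [2:$1%$2@$0], the filter is (.*%admin@EXAMPLE.COM), and the substitution is s/.*/admin/.
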
Examples

  • If your default realm is EXAMPLE.COM, but you also want to accept all principals from ACME.COM that have a single component (for example, joe@ACME.COM), the following rules would do this:

    RULE:[1:$1@$0](.*@ACME.COM)s/@.*//
    DEFAULT

  • To translate names with a second component, you could use these rules:

    RULE:[1:$1@$0](.*@ACME.COM)s/@.*//
    RULE:[2:$1@$0](.*@ACME.COM)s/@.*//
    DEFAULT

  • To treat all principals from EXAMPLE.COM with the extension /admin as admin, your rules would look like this (a complete configuration combining such rules is sketched after these examples):

    RULE:[2:$1%$2@$0](.*%admin@EXAMPLE.COM)s/.*/admin/
    DEFAULT

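In Hadoop, auth-to-local rules are supplied as the value of the hadoop.security.auth_to_local property in core-site.xml. The snippet below is a sketch that combines the rules from the examples above into a single property; the realm names are illustrative:

    <property>
      <name>hadoop.security.auth_to_local</name>
      <value>
        RULE:[1:$1@$0](.*@ACME.COM)s/@.*//
        RULE:[2:$1@$0](.*@ACME.COM)s/@.*//
        RULE:[2:$1%$2@$0](.*%admin@EXAMPLE.COM)s/.*/admin/
        DEFAULT
      </value>
    </property>

Before relying on the rules, you can check how a given principal maps by running the HadoopKerberosName utility on a node where this configuration is in place; it prints the short name that the configured rules produce for each principal you pass in:

    hadoop org.apache.hadoop.security.HadoopKerberosName myusername/admin@EXAMPLE.COM
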
After your mapping rules have been configured and are in place, Hadoop uses them to map principals to UNIX users. By default, Hadoop uses the UNIX shell to resolve a user's UID, GID, and list of associated groups for secure operation on every node in the cluster. This matters because in a Kerberized cluster, individual tasks run as the user who submitted the application: the user's identity is propagated all the way down to the local JVM processes to ensure tasks run as the user who submitted them. For this reason, typical enterprise customers use technologies such as PAM, SSSD, Centrify, or other solutions to integrate with a corporate directory. Because Linux is commonly used in the enterprise, your organization has most likely already adopted such a solution. The assumption going forward is that it has been integrated successfully, so logging in to each individual DataNode over SSH with LDAP credentials works, and running the id command returns the user's UID, GID, and list of associated groups, as shown below.
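
For example, on a node where directory integration is working, a session might look like the following (the hostname, user name, and numeric IDs here are illustrative):

    $ ssh myusername@datanode01.example.com
    $ id
    uid=50021(myusername) gid=10000(hadoop-users) groups=10000(hadoop-users),10001(hdfs)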

Note

If you use Hue, you must install and configure Hue manually after running the Kerberos wizard. For information about installing Hue manually, see Installing Hue.