Apache ZooKeeper ACLs

ZooKeeper ACLs Best Practices: YARN Registry

Best practices for tightening the ZooKeeper ACLs/permissions for YARN Registry when provisioning a secure cluster.

The YARN registry is a location into which statically and dynamically deployed applications can register service endpoints; client applications can look up these entries to determine the URLs and IPC ports with which to communicate with a service.

It is implemented as a ZooKeeper tree: services register themselves either as system services, under the registry path /system, or as user services, under /users/USERNAME, where USERNAME is the name of the user registering the service.
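
As a rough illustration of that layout, the sketch below builds the two kinds of registry paths. The root znode /registry is an assumption taken from the paths listed under "ZK Paths and Permissions"; the helper names are hypothetical, and any username normalization or encoding that the real registry client performs is not modeled here.

```python
import posixpath

# Assumption: the registry root znode is "/registry", matching the paths
# listed in the permissions table of this document.
REGISTRY_ROOT = "/registry"


def system_path(root: str = REGISTRY_ROOT) -> str:
    """Return the znode under which system services register."""
    return posixpath.join(root, "system")


def user_path(username: str, root: str = REGISTRY_ROOT) -> str:
    """Return the znode under which USERNAME's services register."""
    # Reject values that would escape the per-user subtree.
    if not username or "/" in username:
        raise ValueError("username must be a non-empty single path element")
    return posixpath.join(root, "users", username)
```

For a hypothetical user alice, user_path("alice") yields /registry/users/alice.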

As the purpose of the mechanism is to allow arbitrary clients to look up a service, the entries are always world readable. No secrets should be added to service entries.

In insecure mode, all registry paths are world readable and writeable: nothing may be trusted.

In a secure cluster, the registry is designed to work as follows:
  1. Kerberos + SASL provides the identification and authentication.

  2. /system services can only be registered by designated system applications (YARN, HDFS, etc.).

  3. User-specific services can only be registered by the user deploying the application.

  4. If a service is registered under a user's path, it may be trusted, and any published public information (such as HTTPS certificates) can be assumed to have been issued by the user.

  5. All user registry entries should also be registered as writeable by the system accounts defined in hadoop.registry.system.accounts; this is a list of ZK SASL-authenticated accounts to be given full access. This is needed to support system administration of the entries, especially automated deletion of old entries after application failures.

  6. The default list of system accounts is yarn, mapred, hdfs, and hadoop; these are automatically associated with the Kerberos realm of the process interacting with the registry, to create the appropriate sasl:account@REALM ZK entries.

  7. If applications are running from different realms, the configuration option hadoop.registry.kerberos.realm must be set to the desired realm, or hadoop.registry.system.accounts configured with the full realms of the accounts.

  8. There is support for ZooKeeper id:digest authentication; this allows a user's short-lived YARN applications to register service endpoints without needing the Kerberos TGT. It requires active use by the launching application, which must either explicitly create a user service node with an id:digest permission or set hadoop.registry.user.accounts to the list of credentials to be permitted.

  9. System services must not use id:digest authentication, nor should they need to; any long-lived service already needs to have a Kerberos keytab.

  10. The per-user path for their user services, /users/USERNAME, is created by the YARN resource manager when users launch services, if the RM is launched with the option hadoop.registry.rm.enabled set to true.

  11. When hadoop.registry.rm.enabled is true, the RM will automatically purge application and container service records when the applications and containers terminate.

  12. Communication with ZK is over SASL, using the java.security.auth.login.config system property to configure the binding. The specific JAAS context to use can be set in hadoop.registry.jaas.context if the default value, Client, is not appropriate.
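
The identities mentioned in items 6 to 8 can be sketched as follows. This is a minimal illustration, not the registry's own code: the function names and the realm EXAMPLE.COM are hypothetical, and the digest form assumes ZooKeeper's default digest scheme, in which the id is the user name followed by the Base64-encoded SHA-1 of "user:password".

```python
import base64
import hashlib

# Default system accounts named in item 6.
DEFAULT_SYSTEM_ACCOUNTS = ["yarn", "mapred", "hdfs", "hadoop"]


def sasl_id(account: str, realm: str) -> str:
    """Build a sasl:account@REALM identity.

    Accounts that already carry a realm after '@' are left unchanged;
    accounts with no '@', or with a trailing '@', get `realm` appended.
    """
    name = account[len("sasl:"):] if account.startswith("sasl:") else account
    user, _, existing_realm = name.partition("@")
    return f"sasl:{user}@{existing_realm or realm}"


def digest_id(user: str, password: str) -> str:
    """Build the id part of a digest ACL: user:base64(sha1(user:password))."""
    raw = hashlib.sha1(f"{user}:{password}".encode("utf-8")).digest()
    return f"{user}:{base64.b64encode(raw).decode('ascii')}"


def system_ids(realm: str, accounts=DEFAULT_SYSTEM_ACCOUNTS) -> list:
    """Expand every system account to its full SASL identity."""
    return [sasl_id(account, realm) for account in accounts]
```

With the hypothetical realm EXAMPLE.COM, system_ids("EXAMPLE.COM") gives sasl:yarn@EXAMPLE.COM, sasl:mapred@EXAMPLE.COM, and so on, while an entry such as sasl:hdfs@OTHER keeps the realm it already declares.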

ZK Paths and Permissions:

All paths are world-readable. The permissions below are set up when the RM creates the root entry and the user paths, provided that hadoop.registry.secure=true.
| Path | Role | Permissions |
|---|---|---|
| /registry | Base registry path | yarn, hdfs, mapred, hadoop: cdrwa |
| /registry/system | System services | yarn, hdfs, mapred, hadoop: cdrwa |
| /registry/users | Users | yarn, hdfs, mapred, hadoop: cdrwa |
| /registry/users/USER | The registry tree for the user USER. | USER: rwa; yarn, hdfs, mapred, hadoop: cdrwa |
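
To make the permission letters concrete, the sketch below maps them to ZooKeeper's permission bits (READ=1, WRITE=2, CREATE=4, DELETE=8, ADMIN=16) and models the ACLs listed above as plain data. The function names are hypothetical, the identities are shown without a realm for brevity, and the world-readable entry is written as world:anyone with read permission, which is an assumption about how "world-readable" would be expressed.

```python
# ZooKeeper permission bits, keyed by the letters used in this document.
PERM_BITS = {"r": 1, "w": 2, "c": 4, "d": 8, "a": 16}

SYSTEM_ACCOUNTS = ("yarn", "hdfs", "mapred", "hadoop")


def perms_to_int(letters: str) -> int:
    """Convert a permission string such as 'cdrwa' to its bit mask."""
    mask = 0
    for letter in letters:
        mask |= PERM_BITS[letter]
    return mask


def acl_for(user=None) -> dict:
    """Return {identity: permission letters}.

    With user=None this models /registry, /registry/system and
    /registry/users; with a user name it models /registry/users/USER.
    """
    acl = {"world:anyone": "r"}  # every path is world-readable
    for account in SYSTEM_ACCOUNTS:
        acl[f"sasl:{account}"] = "cdrwa"
    if user is not None:
        acl[f"sasl:{user}"] = "rwa"  # the owning user gets rwa only
    return acl
```

Here perms_to_int("cdrwa") evaluates to 31 (all permissions) and perms_to_int("rwa") to 19, which shows that the owning user cannot create or delete child nodes under these ACLs while the system accounts can.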

Configuration options for secure registry access
| Name | Recommended Value | Description |
|---|---|---|
| hadoop.registry.secure | true | |
| hadoop.registry.rm.enabled | true | |
| hadoop.registry.system.accounts | sasl:yarn@, sasl:mapred@, sasl:hdfs@, sasl:hadoop@ | Grants system accounts write access to the root registry paths. A tighter version would be sasl:yarn@, which gives only the RM the right to manipulate these; a realm can also be declared explicitly, as in sasl:yarn@EXAMPLE. |
| hadoop.registry.kerberos.realm | (empty) | The Kerberos realm to use when converting the system accounts to full realms. If left empty, the realm of the user is used. |
| hadoop.registry.user.accounts | (empty) | |
| hadoop.registry.client.auth | kerberos | How to authenticate with ZK. Alternative (insecure) options: anonymous, digest. |
| hadoop.registry.jaas.context | Client | The JAAS context to use for registry clients to authenticate with ZooKeeper. |
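
As one way to apply these values, the sketch below renders the recommended settings as Hadoop-style property XML. The helper name is hypothetical, and which configuration file the properties belong in is not stated here, so the output is only a fragment to adapt.

```python
import xml.etree.ElementTree as ET

# Recommended values from the table above; empty strings mean "(empty)".
RECOMMENDED = {
    "hadoop.registry.secure": "true",
    "hadoop.registry.rm.enabled": "true",
    "hadoop.registry.system.accounts":
        "sasl:yarn@, sasl:mapred@, sasl:hdfs@, sasl:hadoop@",
    "hadoop.registry.kerberos.realm": "",
    "hadoop.registry.user.accounts": "",
    "hadoop.registry.client.auth": "kerberos",
    "hadoop.registry.jaas.context": "Client",
}


def to_hadoop_xml(settings: dict) -> str:
    """Render settings as <configuration><property>...</property></configuration>."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hadoop_xml(RECOMMENDED))
```

Keeping the values in one dictionary makes it easy to tighten a single entry, for example replacing the system accounts list with sasl:yarn@ alone, before generating the fragment.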