Configure YARN for running Docker containers

Running Docker containers on YARN works much like running existing containers. Containers have access to the files that are localized for the container as well as to logging.

To facilitate the use of YARN features, a few rules need to be followed. For the example applications, these steps have already been taken care of.

The processes in the containers must run as the user submitting the application (or the local-user in insecure mode).
The mount whitelist must include the yarn.nodemanager.local-dirs so that the files needed for the application are available in the container.

The following configuration runs LinuxContainerExecutor in an insecure mode and should be used only for testing or where use cases are highly controlled. Kerberos configurations are recommended for production. The local-user is assumed to be nobody, which means that all containers will run as the nobody user.

Make sure YARN cgroups are enabled before configuring YARN for running Docker containers.

To leverage YARN cgroup support, the nodemanager must be configured to use LinuxContainerExecutor. The Docker YARN integration also requires this container executor.
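As a quick sanity check of this requirement, the sketch below reads a yarn-site.xml document and reports whether the NodeManager is set to use LinuxContainerExecutor. The property name yarn.nodemanager.container-executor.class and the class name are the standard Hadoop ones but are not listed in this section, so treat them as assumptions to verify against your distribution. The sample XML and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed (not listed in this section): the standard Hadoop property and class names.
EXECUTOR_PROPERTY = "yarn.nodemanager.container-executor.class"
LCE_CLASS = "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor"

# Hypothetical minimal yarn-site.xml used only for illustration.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
</configuration>"""


def uses_linux_container_executor(yarn_site_xml):
    """Return True if the executor property is set to LinuxContainerExecutor."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == EXECUTOR_PROPERTY:
            value = (prop.findtext("value") or "").strip()
            return value == LCE_CLASS
    return False  # property absent, so this executor is not explicitly configured


if __name__ == "__main__":
    print(uses_linux_container_executor(SAMPLE_YARN_SITE))
```

In practice the XML text would be read from the yarn-site.xml file on the NodeManager host rather than from an inline string.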

Set the following properties in the yarn-site.xml file.


<property>
    <description>The UNIX user that containers will run as when
    Linux-container-executor is used in nonsecure mode</description>
    <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user</name>
    <value>nobody</value>
</property>
                        
<property>
    <description>Comma separated list of runtimes that are allowed when using
    LinuxContainerExecutor.</description>
    <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
    <value>default,docker</value>
</property>
                        
<property>
    <description>This configuration setting determines the capabilities
    assigned to docker containers when they are launched. While these may not
    be case-sensitive from a docker perspective, it is best to keep these
    uppercase. To run without any capabilities, set this value to
    "none" or "NONE"</description>
    <name>yarn.nodemanager.runtime.linux.docker.capabilities</name>
    <value>CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE</value>
</property>
                        
<property>
    <description>This configuration setting determines if
    privileged docker containers are allowed on this cluster.
    The submitting user must be part of the privileged container acl and 
    must be part of the docker group or have sudo access to the docker command 
    to be able to use a privileged container. Use with extreme care.</description>
    <name>yarn.nodemanager.runtime.linux.docker.privileged-containers.allowed</name>
    <value>false</value>
</property>
                        
<property>
    <description>This configuration setting determines the submitting 
    users who are allowed to run privileged docker containers on this cluster. 
    The submitting user must also be part of the docker group or have sudo access
    to the docker command. No users are allowed by default. Use with extreme care. 
    </description>
    <name>yarn.nodemanager.runtime.linux.docker.privileged-containers.acl</name>
    <value> </value>
</property>
                        
<property>
    <description>The set of networks allowed when launching containers</description>
    <name>yarn.nodemanager.runtime.linux.docker.allowed-container-networks</name>
    <value>host,bridge</value>
</property>
                        
<property>
    <description>The network used when launching containers when no network is specified 
    in the request. This network must be one of the (configurable) set of allowed 
    container networks. The default is host, which may not be appropriate for multiple
    containers on a single node; use bridge in that case. See docker networking for more.
    </description>
    <name>yarn.nodemanager.runtime.linux.docker.default-container-network</name>
    <value>host</value>
</property>
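Several of these properties depend on one another: docker must appear in the allowed runtimes, and the default container network must be one of the allowed container networks. The following is a minimal sketch of how those two constraints could be checked before restarting the NodeManager. The sample document and the helper names (load_properties, split_csv, find_problems) are hypothetical, and the checks cover only the relationships stated in the property descriptions above.

```python
import xml.etree.ElementTree as ET

PREFIX = "yarn.nodemanager.runtime.linux."

# Hypothetical excerpt of yarn-site.xml, using values from the listing above.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
    <value>default,docker</value>
  </property>
  <property>
    <name>yarn.nodemanager.runtime.linux.docker.allowed-container-networks</name>
    <value>host,bridge</value>
  </property>
  <property>
    <name>yarn.nodemanager.runtime.linux.docker.default-container-network</name>
    <value>host</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Collect <property> name/value pairs into a dict."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name:
            props[name] = (prop.findtext("value") or "").strip()
    return props


def split_csv(value):
    """Split a comma-separated value, dropping blanks and surrounding whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def find_problems(props):
    """Return human-readable descriptions of inconsistent settings."""
    problems = []
    runtimes = split_csv(props.get(PREFIX + "allowed-runtimes", ""))
    if "docker" not in runtimes:
        problems.append("docker is not listed in allowed-runtimes")
    allowed = split_csv(props.get(PREFIX + "docker.allowed-container-networks", ""))
    default = props.get(PREFIX + "docker.default-container-network", "")
    if default not in allowed:
        problems.append("default network %r is not an allowed network" % default)
    return problems


if __name__ == "__main__":
    issues = find_problems(load_properties(SAMPLE_YARN_SITE))
    print(issues if issues else "no problems found")
```

An empty result only means these two checks passed; it does not validate the rest of the file.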

Set the following properties in a container-executor.cfg file.


yarn.nodemanager.local-dirs=<yarn.nodemanager.local-dirs from yarn-site.xml>
yarn.nodemanager.log-dirs=<yarn.nodemanager.log-dirs from yarn-site.xml>
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
min.user.id=50
                                
[docker]
module.enabled=true
docker.binary=/usr/bin/docker
docker.allowed.capabilities=CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE,DAC_READ_SEARCH,SYS_PTRACE,SYS_ADMIN
docker.allowed.devices=
docker.allowed.networks=bridge,host,none
docker.allowed.ro-mounts=/sys/fs/cgroup,<yarn.nodemanager.local-dirs from yarn-site.xml>
docker.allowed.rw-mounts=<yarn.nodemanager.local-dirs from yarn-site.xml>,<yarn.nodemanager.log-dirs from yarn-site.xml>
docker.privileged-containers.enabled=false
docker.trusted.registries=local,centos,hortonworks
docker.allowed.volume-drivers=
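The mount whitelist and capability settings in this file have to line up with the directories and capabilities that YARN uses. The sketch below shows one way to check that, under stated assumptions: the directory paths in the sample are hypothetical placeholders, the helper names are invented for illustration, and the comparison is an exact string match, which is simpler than whatever matching the container-executor itself performs.

```python
# Hypothetical container-executor.cfg content; the paths are placeholders.
SAMPLE_CFG = """\
yarn.nodemanager.local-dirs=/hadoop/yarn/local
yarn.nodemanager.log-dirs=/hadoop/yarn/log
yarn.nodemanager.linux-container-executor.group=hadoop

[docker]
module.enabled=true
docker.allowed.capabilities=CHOWN,DAC_OVERRIDE,SETGID,SETUID
docker.allowed.ro-mounts=/sys/fs/cgroup,/hadoop/yarn/local
docker.allowed.rw-mounts=/hadoop/yarn/local,/hadoop/yarn/log
"""


def split_csv(value):
    """Split a comma-separated value, dropping blanks and surrounding whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_executor_cfg(text):
    """Parse key=value lines; keys before any [section] header go under ''."""
    sections = {"": {}}
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            sections.setdefault(current, {})
            continue
        key, sep, value = line.partition("=")
        if sep:
            sections[current][key.strip()] = value.strip()
    return sections


def missing_mounts(sections):
    """List local/log dirs that are absent from the Docker mount whitelists."""
    top = sections.get("", {})
    docker = sections.get("docker", {})
    ro = set(split_csv(docker.get("docker.allowed.ro-mounts", "")))
    rw = set(split_csv(docker.get("docker.allowed.rw-mounts", "")))
    missing = []
    for path in split_csv(top.get("yarn.nodemanager.local-dirs", "")):
        if path not in ro and path not in rw:
            missing.append(path)
    for path in split_csv(top.get("yarn.nodemanager.log-dirs", "")):
        if path not in rw:  # log dirs need to be writable
            missing.append(path)
    return missing


def disallowed_capabilities(sections, requested):
    """Return requested capabilities not present in docker.allowed.capabilities."""
    docker = sections.get("docker", {})
    allowed = set(split_csv(docker.get("docker.allowed.capabilities", "")))
    return sorted(cap for cap in split_csv(requested) if cap not in allowed)


if __name__ == "__main__":
    cfg = parse_executor_cfg(SAMPLE_CFG)
    print("missing mounts:", missing_mounts(cfg))
    print("disallowed:", disallowed_capabilities(cfg, "CHOWN,SYS_ADMIN"))
```

The requested string passed to disallowed_capabilities would normally be the value of yarn.nodemanager.runtime.linux.docker.capabilities from yarn-site.xml.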

The details of the properties are as follows.


Configuration	Description
`yarn.nodemanager.linux-container-executor.group`	The Unix group of the NodeManager. It should match the yarn.nodemanager.linux-container-executor.group in the yarn-site.xml file.
`banned.users`	A comma-separated list of users who should not be allowed to launch applications. The default setting is: yarn, mapred, hdfs, and bin.
`min.user.id`	The minimum UID that is allowed to launch applications. The default is no minimum.
`module.enabled`	Must be "true" or "false" to enable or disable launching Docker containers, respectively. Default value is 0, which leaves launching Docker containers disabled.
`docker.binary`	The binary used to launch Docker containers. /usr/bin/docker by default.
`docker.allowed.capabilities`	Comma-separated capabilities that containers are allowed to add. By default, no capabilities are allowed to be added.
`docker.allowed.devices`	Comma-separated devices that containers are allowed to mount. By default, no devices are allowed to be added.
`docker.allowed.networks`	Comma-separated networks that containers are allowed to use. If no network is specified when launching the container, the default Docker network will be used.
`docker.allowed.ro-mounts`	Comma-separated directories that containers are allowed to mount in read-only mode. By default, no directories are allowed to be mounted.
`docker.allowed.rw-mounts`	Comma-separated directories that containers are allowed to mount in read-write mode. By default, no directories are allowed to be mounted.
`docker.privileged-containers.enabled`	Set to "true" or "false" to enable or disable launching privileged containers. Default value is "false". The submitting user must be defined in the privileged container acl setting and must be part of the docker group or have sudo access to the docker command to be able to use a privileged container. Use with extreme care.
`docker.trusted.registries`	Comma-separated list of trusted Docker registries for running trusted privileged Docker containers. By default, no registries are defined. If the image used for the application does not appear in this list, all capabilities, mounts, and privileges will be stripped from the container.
`docker.allowed.volume-drivers`	Comma-separated volume drivers that containers are allowed to use.