Recommendations for running Docker containers on YARN

Docker Version

1.12.5 is the minimum recommended version. Docker is evolving rapidly and ships multiple releases per year, so not all versions of Docker have been tested. Docker's versioning scheme changed in 2017, and the product is now known as Docker CE. Running a recent version of Docker CE is recommended. Note that recent versions of Docker CE have switched to the overlay2 storage driver, which may not work for all workloads.

RHEL/CentOS provides a version of Docker that can be installed via yum.

Storage Driver

Selecting a storage driver is dependent on OS kernel, workload, and Docker version. It is highly recommended that administrators read the documentation, consult with their operating system vendor, and test the desired workload before making a determination.

Testing has shown that device mapper using LVM is generally stable. Under high write load to the container’s root filesystem, however, device mapper has exhibited panics. SSDs for the Docker graph storage are recommended in this case, but care still needs to be taken. Overlay and overlay2 perform significantly better than device mapper and are recommended if the OS kernel and workload support them.

CGroup Support

YARN provides isolation through the use of cgroups. Docker also has cgroup management built in. If isolation through cgroups is desired, the only recommended solution at this time is to use YARN’s cgroup management. YARN will create the cgroup hierarchy and set the --cgroup-parent flag when launching the container.

For more information about setting YARN cgroups, see Enabling cgroups.

The cgroupdriver must be set to cgroupfs. You must ensure that Docker is running using the --exec-opt native.cgroupdriver=cgroupfs docker daemon option.
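One way to confirm the setting is to inspect the output of docker info. The following is a minimal sketch, assuming the output contains a line of the form "Cgroup Driver: <name>"; the helper names cgroup_driver_from_info and check_cgroup_driver are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Extract the cgroup driver from `docker info` text supplied on stdin.
# Leading whitespace is tolerated because some Docker releases indent the field.
cgroup_driver_from_info() {
    sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: succeed only when the reported driver is cgroupfs.
check_cgroup_driver() {
    driver=$(cgroup_driver_from_info)
    if [ "$driver" = "cgroupfs" ]; then
        echo "OK: cgroup driver is cgroupfs"
    else
        echo "MISMATCH: cgroup driver is '${driver:-unknown}', expected cgroupfs"
        return 1
    fi
}

# Typical use on a NodeManager host (requires a running Docker daemon):
#   docker info 2>/dev/null | check_cgroup_driver
```

Because the helpers read from standard input, they can also be run against saved docker info output collected from several hosts.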

Note

The Docker version included with RHEL/CentOS 7.2+ sets the cgroupdriver to systemd. You must change this, typically in the docker.service systemd unit file.


vi /usr/lib/systemd/system/docker.service

Find and fix the cgroupdriver:


--exec-opt native.cgroupdriver=cgroupfs \

Also, this version of Docker may include oci-hooks that expect to use the systemd cgroupdriver. Search for oci on your system and remove these files. For example:


rm -f /usr/libexec/oci/hooks.d/oci-systemd-hook
rm -f /usr/libexec/oci/hooks.d/oci-register-machine
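The unit file edit described above can also be scripted instead of being made by hand in vi. The sketch below assumes the option appears literally as native.cgroupdriver=systemd in the file; fix_cgroupdriver is a hypothetical helper name, and the layout of the unit file on a given host may differ.

```shell
#!/bin/sh
# Rewrite native.cgroupdriver=systemd to native.cgroupdriver=cgroupfs in a unit file.
# A copy of the original is kept next to it with a .bak suffix.
fix_cgroupdriver() {
    unit_file=$1
    cp -p "$unit_file" "$unit_file.bak" || return 1
    sed 's/native\.cgroupdriver=systemd/native.cgroupdriver=cgroupfs/' \
        "$unit_file.bak" > "$unit_file"
}

# On a real host (as root), followed by reloading systemd and restarting Docker:
#   fix_cgroupdriver /usr/lib/systemd/system/docker.service
#   systemctl daemon-reload
#   systemctl restart docker
```

Keeping the .bak copy makes it easy to compare or restore the original unit file if the daemon fails to start after the change.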

Networking

YARN supports running Docker containers on a user-specified network; however, it does not manage the Docker networks. Administrators are expected to create the networks before running the containers. Node labels can be used to isolate particular networks. It is vital to read and understand the Docker networking documentation. Swarm-based options are not recommended; however, overlay networks can be used if set up using an external store, such as etcd.

YARN will ask Docker for the networking details, such as IP address and hostname. As a result, all networking types are supported. Set the environment variable YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK to specify the network to use.

Host networking is only recommended for testing. If the network where the NodeManagers are running has a sufficient number of IP addresses, bridge networking with the --fixed-cidr option works well. Each NodeManager is allocated a small portion of the larger IP space, and then allocates those IP addresses to containers.
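To illustrate how the larger IP space might be divided, the sketch below carves one /24 per NodeManager out of a /16. The 10.10.0.0/16 range, the per-node index, and the helper name node_fixed_cidr are all assumptions made for this example; an actual deployment should follow the site's own addressing plan.

```shell
#!/bin/sh
# Print the /24 slice for a NodeManager, given the first two octets of a
# hypothetical /16 and a node index between 0 and 255.
node_fixed_cidr() {
    prefix=$1   # e.g. 10.10 (illustrative)
    index=$2
    case $index in
        ''|*[!0-9]*) echo "index must be a non-negative integer" >&2; return 1 ;;
    esac
    if [ "$index" -gt 255 ]; then
        echo "index must be between 0 and 255" >&2
        return 1
    fi
    echo "${prefix}.${index}.0/24"
}

# Example: node 3 of the illustrative 10.10.0.0/16 range
echo "--fixed-cidr=$(node_fixed_cidr 10.10 3)"
```

With this scheme each NodeManager hands out addresses only from its own /24, so containers on different hosts do not receive overlapping addresses.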

To use an administrator defined network, add the network to docker.allowed.networks in container-executor.cfg and yarn.nodemanager.runtime.linux.docker.allowed-container-networks in yarn-site.xml.

Image Management

Images can be preloaded on all NodeManager hosts, or they can be implicitly pulled at runtime if they are available in a public Docker registry, such as Docker Hub. If the image does not exist on the NodeManager and cannot be pulled, the container will fail.

Docker Bind Mounted Volumes

Note

Care should be taken when enabling this feature. Enabling access to directories such as, but not limited to, /, /etc, /run, or /home is not advisable and can result in containers negatively impacting the host or leaking sensitive information.

Files and directories from the host are commonly needed within the Docker containers, which Docker provides through volumes. Examples include localized resources, Apache Hadoop binaries, and sockets. In order to make use of this feature, the following must be configured.

The administrator must define the volume whitelist in container-executor.cfg by setting docker.allowed.ro-mounts and docker.allowed.rw-mounts to the list of parent directories that are allowed to be mounted.

The application submitter requests the required volumes at application submission time using the YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS environment variable.

The administrator supplied whitelist is defined as a comma separated list of directories that are allowed to be mounted into containers. The source directory supplied by the user must either match or be a child of the specified directory.

The user supplied mount list is defined as a comma separated list in the form source:destination:mode. The source is the file or directory on the host. The destination is the path within the container where the source will be bind mounted. The mode defines the mode the user expects for the mount, which can be ro (read-only) or rw (read-write).
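The "match or be a child of" rule can be sketched as a simple path check. This is an illustration of the rule as described here, not the actual container-executor implementation. It compares path strings only and does not resolve symbolic links or ".." components. The helper name mount_source_allowed and all paths shown are hypothetical.

```shell
#!/bin/sh
# Succeed if the source path equals, or is a child of, one of the
# comma-separated whitelist directories. Comparison is on path strings only.
mount_source_allowed() {
    src=${1%/}
    rest=$2,
    while [ -n "$rest" ]; do
        dir=${rest%%,*}      # next whitelist entry
        rest=${rest#*,}      # remainder of the list
        [ -n "$dir" ] || continue   # ignore empty entries
        dir=${dir%/}
        case $src in
            "$dir"|"$dir"/*) return 0 ;;
        esac
    done
    return 1
}

# Illustrative values only; the paths below are hypothetical.
ro_whitelist="/usr/local/hadoop,/opt/data"
mounts="/usr/local/hadoop/etc:/usr/local/hadoop/etc:ro"

src=${mounts%%:*}   # the source is the first field of source:destination:mode
if mount_source_allowed "$src" "$ro_whitelist"; then
    echo "allowed: $mounts"
else
    echo "rejected: $mounts"
fi
```

Note that the check respects path boundaries: with /opt/data in the whitelist, /opt/data/logs is accepted but /opt/data2 is not.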