Managing Data Operating System

YARN Cluster Mode Configuration

In the YARN cluster mode configuration, YARN schedules and runs the Spark job that a user submits to the cluster. The ApplicationMaster hosts the Spark driver, which is launched within a Docker container.

The following image provides an overview of the cluster mode configuration:

During application submission, you must specify --deploy-mode=cluster. In addition, you can set the container configurations for the Spark driver or the ApplicationMaster along with the executor’s container configurations through environment variables during submission.
Note
The container image for the driver can be different from that for the executor.
You can configure the additional settings for the driver as follows:
spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker

spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=<docker-image>

spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/passwd:/etc/passwd:ro
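
The driver settings above can be combined with matching executor settings in a single submission. The following is a minimal Python sketch, not a definitive implementation, that only assembles the spark-submit argument list. It assumes the executor variables are passed through Spark's spark.executorEnv.* prefix. The helper names (docker_env, build_submit_args), the image names, and app.py are hypothetical placeholders.

```python
import shlex

# Prefix used for the driver/ApplicationMaster in the settings above.
DRIVER_PREFIX = "spark.yarn.appMasterEnv."
# Assumption: executors receive the same variables through this Spark prefix.
EXECUTOR_PREFIX = "spark.executorEnv."


def docker_env(image, mounts="/etc/passwd:/etc/passwd:ro"):
    """Return the YARN Docker runtime variables for one container image."""
    return {
        "YARN_CONTAINER_RUNTIME_TYPE": "docker",
        "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": image,
        "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS": mounts,
    }


def build_submit_args(app, driver_image, executor_image):
    """Assemble a cluster-mode spark-submit argument list (hypothetical helper)."""
    args = ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster"]
    # The driver and the executors may use different images.
    for prefix, image in ((DRIVER_PREFIX, driver_image),
                          (EXECUTOR_PREFIX, executor_image)):
        for name, value in docker_env(image).items():
            args += ["--conf", f"{prefix}{name}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # Placeholder image names and application file; substitute your own.
    cmd = build_submit_args("app.py", "example/driver-image", "example/executor-image")
    print(shlex.join(cmd))
```

Running the script prints the assembled command so that it can be reviewed before it is executed. Building the arguments as a list keeps each --conf value as a single token, which avoids quoting problems with the colon-separated mount string.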