YARN Cluster Mode Configuration
In the YARN cluster mode configuration, YARN schedules and runs the Spark job that a user submits to the cluster. The ApplicationMaster hosts the Spark driver, which is launched within a Docker container.
The following image provides an overview of the cluster mode configuration:
To run a job in this mode, specify --deploy-mode=cluster when you submit it. In addition, you can set the container configuration for the Spark driver (the ApplicationMaster), along with the container configuration for the executors, through environment variables during submission.
The container image for the driver can be different from that for the executor.
spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker
spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=<docker-image>
spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/passwd:/etc/passwd:ro
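The properties above can be combined into a full submission command. The following is a minimal sketch, not a definitive recipe: it assembles a spark-submit argument list that sets the Docker runtime variables for the ApplicationMaster (through spark.yarn.appMasterEnv.*) and for the executors (through spark.executorEnv.*), so that the driver and the executors can use different images. The helper name build_submit_command, the image names, and the application file my_job.py are hypothetical placeholders, and the /etc/passwd mount is simply reused from the properties above.

```python
import shlex


def build_submit_command(driver_image, executor_image, app,
                         mounts="/etc/passwd:/etc/passwd:ro"):
    """Build a spark-submit argument list for YARN cluster mode on Docker.

    Hypothetical helper for illustration; it only assembles the command
    and does not run spark-submit.
    """
    argv = ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster"]

    # spark.yarn.appMasterEnv.* sets environment variables for the
    # ApplicationMaster, which hosts the driver in cluster mode.
    # spark.executorEnv.* sets environment variables for the executors.
    targets = (
        ("spark.yarn.appMasterEnv", driver_image),
        ("spark.executorEnv", executor_image),
    )
    for prefix, image in targets:
        settings = {
            "YARN_CONTAINER_RUNTIME_TYPE": "docker",
            "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": image,
            "YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS": mounts,
        }
        for name, value in settings.items():
            argv += ["--conf", f"{prefix}.{name}={value}"]

    argv.append(app)
    return argv


if __name__ == "__main__":
    # Placeholder image names and application file (assumptions).
    command = build_submit_command(
        driver_image="example/spark-driver:latest",
        executor_image="example/spark-executor:latest",
        app="my_job.py",
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(command))
```

Printing the command instead of running it makes it easy to review the generated --conf options before submitting the job to the cluster.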