Managing Data Operating System

Using Docker Containers on YARN for Spark Jobs

Apache Spark applications often have complex software dependencies, which makes package isolation difficult, especially when you have to install multiple versions of the same dependency on the cluster hosts where Spark executors run. Docker containers on YARN address this problem by letting you package and manage the software dependencies in separate Docker images, which the containers then run, instead of installing them directly on the hosts.
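As a rough illustration of how a job selects its image, the following Python sketch builds a spark-submit command line that asks YARN to launch the application master and the executors in a Docker container. It assumes the cluster already has the YARN Docker runtime enabled and permits the image. The image name example/spark-py-deps:1.0, the script name my_job.py, and the two helper functions are hypothetical. The environment variables YARN_CONTAINER_RUNTIME_TYPE and YARN_CONTAINER_RUNTIME_DOCKER_IMAGE are the ones YARN reads for Docker containers, passed through Spark's spark.executorEnv.* and spark.yarn.appMasterEnv.* properties.

```python
from typing import Dict, List


def docker_on_yarn_conf(image: str) -> Dict[str, str]:
    """Return Spark properties that request the YARN Docker runtime.

    Hypothetical helper: it only assembles property names and values,
    it does not contact a cluster.
    """
    # Environment variables that YARN inspects when launching a container.
    env = {
        "YARN_CONTAINER_RUNTIME_TYPE": "docker",
        "YARN_CONTAINER_RUNTIME_DOCKER_IMAGE": image,
    }
    conf: Dict[str, str] = {}
    for name, value in env.items():
        # Set the variable for the executors and for the application master.
        conf[f"spark.executorEnv.{name}"] = value
        conf[f"spark.yarn.appMasterEnv.{name}"] = value
    return conf


def spark_submit_args(image: str, app: str) -> List[str]:
    """Build a spark-submit argument list for a cluster-mode YARN job."""
    args = ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster"]
    for key, value in docker_on_yarn_conf(image).items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # Image and script names are placeholders, not values from this guide.
    print(" ".join(spark_submit_args("example/spark-py-deps:1.0", "my_job.py")))
```

Two applications that need conflicting library versions can then point at two different images and run on the same hosts without touching the host-level packages.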