2. Enable Tez AM

Follow these steps to enable the Tez Application Master (AM):

  1. On all the client nodes and the Tez Service host machine, edit the /etc/tez/conf/tez-env.sh file and set the following environment variables:

    export HADOOP_HOME="$HADOOP_HOME"
    export JAVA_HOME="$JAVA_HOME"

    where

    • $HADOOP_HOME is the location of the directory that contains all core Hadoop JAR files. For example, /usr/lib/hadoop.

    • $JAVA_HOME is the location of the directory that contains the JDK.

  2. Ensure that the $HADOOP_HOME/bin/hadoop file exists on the Tez Service host machine.
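
The check in step 2 can be scripted. The following is a minimal sketch, not part of the product: check_hadoop_launcher is a hypothetical helper name, and it tests that the file is executable, which is slightly stricter than the step requires.

```shell
# Hypothetical helper (not shipped with Hadoop or Tez).
# Succeeds only if <hadoop home>/bin/hadoop exists and is executable.
# Argument 1: the Hadoop home directory, for example /usr/lib/hadoop.
check_hadoop_launcher() {
    hadoop_home="$1"
    if [ -x "$hadoop_home/bin/hadoop" ]; then
        echo "OK: $hadoop_home/bin/hadoop"
    else
        echo "MISSING: $hadoop_home/bin/hadoop" >&2
        return 1
    fi
}
```

To use it on the Tez Service host machine, run check_hadoop_launcher "$HADOOP_HOME" after tez-env.sh has been sourced.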

  3. On all the client nodes and the Tez Service host machine, edit mapred-site.xml and modify the following properties:

    1. Enable Tez AM:

      <property>    
          <name>mapreduce.framework.name</name>    
          <value>yarn-tez</value>    
          <description>Name of the MapReduce framework. Default value is yarn.</description>  
      </property>

    2. Set the MapReduce CLASSPATH to a CLASSPATH that contains all the Tez JAR files:

      <property>    
          <name>mapreduce.application.classpath</name>    
          <value>$TEZ_HOME/*,$TEZ_HOME/lib/*</value>    
          <description>Classpath for MapReduce applications.</description>  
      </property> 

      where $TEZ_HOME is the location of the directory that contains all the Tez JAR files. By default, $TEZ_HOME is set to /usr/lib/tez.

    3. Enable container reuse across task attempts:

      <property>    
          <name>yarn.app.mapreduce.am.scheduler.reuse.enable</name>    
          <value>true</value>    
          <description>Enable container reuse across task attempts. Default is set to false.</description>  
      </property>

    4. Define the number of task attempts to run in a single container before the container is released. Use -1 to disable this limit:

      <property>    
          <name>yarn.app.mapreduce.am.scheduler.reuse.max-attempts-per-container</name>    
          <value>-1</value>    
          <description>Defines number of task attempts to be run on a single container before the container is
                  released. To disable this limit, set the value of this property to -1.</description>  
      </property> 

      Note

      With certain workloads, some jobs tend to leak memory. In that case, we recommend setting yarn.app.mapreduce.am.scheduler.reuse.max-attempts-per-container to a bounded value (for example, 5 or 10) instead of -1, so that containers are released periodically.
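
After editing mapred-site.xml, it can help to confirm that each property holds the intended value. The sketch below assumes that <name> and <value> sit on consecutive lines, as in the snippets above, and that grep supports the -A option (GNU and BSD grep do). get_property is a hypothetical helper, not a Hadoop command, and it is not a general XML parser.

```shell
# Hypothetical helper (not a Hadoop command).
# Prints the <value> found on the line after a given <name> element.
# Argument 1: path to the XML file, for example mapred-site.xml.
# Argument 2: the property name to look up.
get_property() {
    file="$1"
    name="$2"
    # -F matches the name literally, so the dots are not regex wildcards.
    grep -F -A1 "<name>$name</name>" "$file" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}
```

For example, get_property mapred-site.xml mapreduce.framework.name should print yarn-tez once step 3 is complete. The function prints nothing if the property is absent.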

  4. On all the client nodes and the Tez Service host machine, edit hadoop-env.sh and set HADOOP_CLASSPATH as shown below:

    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_HOME/*:$TEZ_HOME/lib/*

    where $TEZ_HOME is the location of the directory that contains all the Tez JAR files. By default, $TEZ_HOME is set to /usr/lib/tez.
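
One way to confirm the HADOOP_CLASSPATH setting is to test the resulting string for both Tez entries. This is a minimal sketch under the assumption that the entries appear exactly as written in step 4; classpath_has_tez is a hypothetical helper name.

```shell
# Hypothetical helper (not shipped with Hadoop or Tez).
# Succeeds if a colon-separated classpath contains both Tez wildcard entries.
# Argument 1: the classpath string, for example "$HADOOP_CLASSPATH".
# Argument 2: the Tez home directory, for example /usr/lib/tez.
classpath_has_tez() {
    classpath="$1"
    tez_home="$2"
    # Wrapping the classpath in colons lets each entry match as ":entry:".
    # The quoted part of each pattern is literal, including its asterisk.
    case ":$classpath:" in
        *":$tez_home/*:"*) ;;
        *) return 1 ;;
    esac
    case ":$classpath:" in
        *":$tez_home/lib/*:"*) ;;
        *) return 1 ;;
    esac
}
```

After sourcing hadoop-env.sh, call classpath_has_tez "$HADOOP_CLASSPATH" "$TEZ_HOME" and check the exit status. Keep the arguments quoted so that the shell does not expand the wildcards.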

