1. Set up Tez for Hive

If your installation specified Tez for Hive (IS_TEZ=yes in cluster.properties), perform the following steps after deployment as the hadoop user:

  1. Open the command prompt with the hadoop account:

    runas /user:hadoop cmd
  2. Make a Tez application directory in HDFS:

    %HADOOP_HOME%\bin\hadoop.cmd dfs -mkdir /apps/tez
  3. Set the directory permissions to 755, which gives the owner full access and all other users read and execute access:

    %HADOOP_HOME%\bin\hadoop.cmd dfs -chmod -R 755 /apps/tez
  4. Change the owner of the directory to hadoop and its group to users:

    %HADOOP_HOME%\bin\hadoop.cmd dfs -chown -R hadoop:users /apps/tez
  5. Copy the Tez home directory on the local machine into the HDFS /apps/tez directory:

    %HADOOP_HOME%\bin\hadoop.cmd dfs -put %TEZ_HOME%\* /apps/tez
  6. Remove the Tez configuration directory from the HDFS Tez application directory:

    %HADOOP_HOME%\bin\hadoop.cmd dfs -rmr -skipTrash /apps/tez/conf
  7. Ensure that the following properties are set in the hive-site.xml:


    Table 6.1. Hive site configuration for Tez

    | Property | Default Value | Description |
    | --- | --- | --- |
    | hive.auto.convert.join.noconditionaltask | true | Specifies whether Hive optimizes converting common JOIN statements into MAPJOIN statements. JOIN statements are converted if this property is enabled and the sum of size for n-1 of the tables/partitions for an n-way join is smaller than the size specified with the hive.auto.convert.join.noconditionaltask.size property. |
    | hive.auto.convert.join.noconditionaltask.size | 10000000 (10 MB) | Specifies the size used to calculate whether Hive converts a JOIN statement into a MAPJOIN statement. The configuration property is ignored unless hive.auto.convert.join.noconditionaltask is enabled. |
    | hive.optimize.reducededuplication.min.reducer | 4 | Specifies the minimum reducer parallelism threshold to meet before merging two MapReduce jobs. However, combining a mapreduce job with parallelism 100 with a mapreduce job with parallelism 1 may negatively impact query performance even with the reduced number of jobs. The optimization is disabled if the number of reducers is less than the specified value. |
    | hive.tez.container.size | -1 | By default, Tez uses the java options from map tasks. Use this property to override that value. Assigned value must match value specified for mapreduce.map.child.java.opts. |
    | hive.tez.java.opts | N/A | Java command line options for Tez. Must be assigned the same value as mapreduce.map.child.java.opts. |


    Adjust the settings above to your environment where appropriate.
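
As an illustration of step 7, the properties from Table 6.1 could appear in hive-site.xml roughly as shown below. This is a minimal sketch that uses only the default values listed in the table; it is not a tuned or complete configuration. hive.tez.java.opts is left as a comment because the table lists no default for it, and the right value depends on your map task settings.

```xml
<!-- Sketch of the Tez-related entries inside the <configuration> element
     of hive-site.xml. Values are the defaults from Table 6.1; adjust them
     to your environment. -->
<property>
  <name>hive.auto.convert.join.noconditionaltask</name>
  <value>true</value>
</property>
<property>
  <name>hive.auto.convert.join.noconditionaltask.size</name>
  <!-- Size in bytes (10 MB) -->
  <value>10000000</value>
</property>
<property>
  <name>hive.optimize.reducededuplication.min.reducer</name>
  <value>4</value>
</property>
<property>
  <name>hive.tez.container.size</name>
  <value>-1</value>
</property>
<!-- hive.tez.java.opts has no default in Table 6.1. If you set it, the table
     says to use the same value as mapreduce.map.child.java.opts. -->
```

Hive reads hive-site.xml at startup, so a restart of the Hive services is normally needed before changed values take effect.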

Verify that the installation succeeded by running the smoke tests for Tez and Hive.
