6. Set up Tez for Hive

If you configured your installation to use Tez for Hive (IS_TEZ=yes in cluster.properties), perform the following steps after deployment as the hadoop user:

  1. Open the command prompt with the hadoop account:

    runas /user:hadoop cmd
  2. Make a Tez application directory in HDFS:

    %HADOOP_HOME%\bin\hadoop.cmd fs -mkdir /apps/tez
  3. Grant the owner full access and all other users read and execute access:

    %HADOOP_HOME%\bin\hadoop.cmd fs -chmod -R 755 /apps/tez
  4. Change the owner of the directory to hadoop and its group to users:

    %HADOOP_HOME%\bin\hadoop.cmd fs -chown -R hadoop:users /apps/tez
  5. Copy the contents of the local Tez home directory into the HDFS /apps/tez directory:

    %HADOOP_HOME%\bin\hadoop.cmd fs -put %TEZ_HOME%\* /apps/tez
  6. Remove the Tez configuration directory from the HDFS Tez application directory:

    %HADOOP_HOME%\bin\hadoop.cmd fs -rm -r -skipTrash /apps/tez/conf
  7. Ensure that the following properties are set in %HIVE_HOME%\conf\hive-site.xml:

     

    Table 6.1. Hive site configuration for Tez

    | Property | Default Value | Description |
    | --- | --- | --- |
    | hive.auto.convert.join.noconditionaltask | true | Specifies whether Hive converts common JOIN statements into MAPJOIN statements. A JOIN statement is converted if this property is enabled and the combined size of n-1 of the tables or partitions in an n-way join is smaller than the value of the hive.auto.convert.join.noconditionaltask.size property. |
    | hive.auto.convert.join.noconditionaltask.size | 10000000 (10 MB) | Specifies the size threshold used to decide whether Hive converts a JOIN statement into a MAPJOIN statement. This property is ignored unless hive.auto.convert.join.noconditionaltask is enabled. |
    | hive.optimize.reducededuplication.min.reducer | 4 | Specifies the minimum reducer parallelism required before Hive merges two MapReduce jobs. Merging reduces the number of jobs, but combining a MapReduce job with parallelism 100 with a MapReduce job with parallelism 1 may still degrade query performance. The optimization is disabled if the number of reducers is less than the specified value. |
    | hive.tez.container.size | -1 | Specifies the size of the containers that Tez spawns. By default (-1), Tez spawns containers of the size of a mapper. Use this property to override that value. |
    | hive.tez.java.opts | N/A | By default, Tez uses the Java options from map tasks. Use this property to override them; set it to the same value as mapreduce.map.java.opts. |


    Note

    Adjust the settings above to your environment where appropriate; the hive-default.xml.template contains examples of the properties.
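
As a rough sketch of step 7, the properties from Table 6.1 might appear in hive-site.xml as shown below. The fragment uses the standard Hadoop configuration layout; in an existing file, the property elements go inside the <configuration> element that is already there. The values are the defaults from the table, except for hive.tez.java.opts, which has no default: the -Xmx1024m shown is only an illustrative assumption and should be replaced with whatever mapreduce.map.java.opts is set to in your environment.

```xml
<configuration>
  <!-- Convert common JOINs into MAPJOINs when the small tables fit under the size threshold -->
  <property>
    <name>hive.auto.convert.join.noconditionaltask</name>
    <value>true</value>
  </property>

  <!-- Size threshold for the conversion above (10 MB); ignored unless that property is true -->
  <property>
    <name>hive.auto.convert.join.noconditionaltask.size</name>
    <value>10000000</value>
  </property>

  <!-- Minimum reducer parallelism required before two MapReduce jobs are merged -->
  <property>
    <name>hive.optimize.reducededuplication.min.reducer</name>
    <value>4</value>
  </property>

  <!-- Default from Table 6.1; override to suit your environment -->
  <property>
    <name>hive.tez.container.size</name>
    <value>-1</value>
  </property>

  <!-- Illustrative value only (assumption): use the same value as mapreduce.map.java.opts -->
  <property>
    <name>hive.tez.java.opts</name>
    <value>-Xmx1024m</value>
  </property>
</configuration>
```

If a property already exists in the file, edit its value in place instead of adding a second entry with the same name.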

Verify that the installation succeeded by running the smoke tests for Tez and Hive.

