If your installation is configured to use Tez for Hive (IS_TEZ=yes in the cluster.properties file), perform the following steps after deployment as the hadoop user:
Open the command prompt with the hadoop account:
runas /user:hadoop cmd
Make a Tez application directory in HDFS:
%HADOOP_HOME%\bin\hadoop.cmd dfs -mkdir /apps/tez
Give the owner full access and all other users read and execute access:
%HADOOP_HOME%\bin\hadoop.cmd dfs -chmod -R 755 /apps/tez
Change the owner of the directory to hadoop:
%HADOOP_HOME%\bin\hadoop.cmd dfs -chown -R hadoop:users /apps/tez
Copy the contents of the Tez home directory on the local machine into the HDFS /apps/tez directory:
%HADOOP_HOME%\bin\hadoop.cmd dfs -put %TEZ_HOME%\* /apps/tez
Remove the Tez configuration directory from the HDFS Tez application directory:
%HADOOP_HOME%\bin\hadoop.cmd dfs -rmr -skipTrash /apps/tez/conf
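To confirm the result of the steps above, one option is to list the directory afterwards. This is a minimal sketch and not part of the original procedure; it assumes the same hadoop command prompt and uses the standard -ls option of the file system shell. The copied Tez files should be listed, and the configuration directory should no longer appear.

```bat
REM List the contents of the Tez application directory in HDFS
%HADOOP_HOME%\bin\hadoop.cmd dfs -ls /apps/tez
```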
Ensure that the following properties are set in the hive-site.xml file:

Table 6.1. Hive site configuration for Tez
| Property | Default Value | Description |
|---|---|---|
| hive.auto.convert.join.noconditionaltask | true | Specifies whether Hive optimizes the conversion of common JOIN statements into MAPJOIN statements. JOIN statements are converted if this property is enabled and the combined size of n-1 of the tables/partitions in an n-way join is smaller than the size specified with the hive.auto.convert.join.noconditionaltask.size property. |
| hive.auto.convert.join.noconditionaltask.size | 10000000 (10 MB) | Specifies the size threshold, in bytes, used to decide whether Hive converts a JOIN statement into a MAPJOIN statement. This property is ignored unless hive.auto.convert.join.noconditionaltask is enabled. |
| hive.optimize.reducededuplication.min.reducer | 4 | Specifies the minimum reducer parallelism required before Hive merges two MapReduce jobs. The optimization is disabled if the number of reducers is less than the specified value, because combining a MapReduce job with parallelism 100 with a MapReduce job with parallelism 1 may hurt query performance even though it reduces the number of jobs. |
| hive.tez.container.size | -1 | Memory size, in MB, of the containers that Tez launches. With the default of -1, Tez uses the container size configured for map tasks. Use this property to override that value; the container must be large enough for the heap requested in hive.tez.java.opts. |
| hive.tez.java.opts | N/A | Java command-line options for Tez. By default, Tez uses the Java options from map tasks. Must be assigned the same value as mapreduce.map.child.java.opts. |

Note: Adjust the settings above to your environment where appropriate.
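
As a minimal sketch, the settings from Table 6.1 could be expressed in hive-site.xml as shown below. The first four values are the defaults listed in the table. The -Xmx1024m heap option is only a placeholder assumption, not a recommended value; replace it with whatever your cluster uses for mapreduce.map.child.java.opts. Merge the property elements into the existing configuration element rather than replacing the file.

```xml
<configuration>
  <!-- Convert common JOINs to MAPJOINs when the small tables fit the size limit -->
  <property>
    <name>hive.auto.convert.join.noconditionaltask</name>
    <value>true</value>
  </property>
  <!-- Size limit for the conversion above: 10000000 bytes (10 MB) -->
  <property>
    <name>hive.auto.convert.join.noconditionaltask.size</name>
    <value>10000000</value>
  </property>
  <!-- Minimum reducer parallelism before two MapReduce jobs are merged -->
  <property>
    <name>hive.optimize.reducededuplication.min.reducer</name>
    <value>4</value>
  </property>
  <!-- Default value from the table; change it only to override the default -->
  <property>
    <name>hive.tez.container.size</name>
    <value>-1</value>
  </property>
  <!-- Placeholder assumption: use the value of mapreduce.map.child.java.opts -->
  <property>
    <name>hive.tez.java.opts</name>
    <value>-Xmx1024m</value>
  </property>
</configuration>
```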
Verify that the installation succeeded by running the smoke tests for Tez and Hive.