15. Configure and Start Apache Hive and Apache HCatalog

  1. Copy the JDBC connector JAR from OLD_HIVE_HOME/lib to CURRENT_HIVE_HOME/lib.
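
A minimal shell sketch of this copy step, assuming you know both Hive lib directories. The function name copy_jdbc_jars and the example pattern mysql-connector-java*.jar are illustrative assumptions, not names taken from this guide; use the connector JAR that your metastore database requires.

```shell
# Hypothetical helper: copy JDBC connector JARs that match a file name
# pattern from the old Hive lib directory to the new one.
copy_jdbc_jars() {
  old_lib=$1
  new_lib=$2
  pattern=$3
  found=0
  # $pattern is intentionally unquoted so the shell expands the glob.
  for jar in "$old_lib"/$pattern; do
    [ -f "$jar" ] || continue
    cp -p "$jar" "$new_lib"/ || return 1
    found=1
  done
  if [ "$found" -ne 1 ]; then
    echo "no JAR matching $pattern in $old_lib" >&2
    return 1
  fi
}

# Example (the pattern is an assumption for a MySQL metastore):
# copy_jdbc_jars "$OLD_HIVE_HOME/lib" "$CURRENT_HIVE_HOME/lib" 'mysql-connector-java*.jar'
```

The function returns a non-zero status when nothing matches, so a mistyped pattern does not pass silently.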

  2. Upgrade the Hive Metastore database schema. Restart the Hive Metastore database and run:

    /usr/hdp/current/hive-metastore/bin/schematool -upgradeSchema -dbType <$databaseType>

    The value for $databaseType can be derby, mysql, oracle or postgres.
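
A small wrapper can reject an unsupported database type before schematool runs. The sketch below is an illustration only: the function name schematool_upgrade_cmd is hypothetical, and it prints the command for one of the four values listed above rather than executing it.

```shell
# Hypothetical helper: print the schema upgrade command for a supported dbType.
schematool_upgrade_cmd() {
  db_type=$1
  case "$db_type" in
    derby|mysql|oracle|postgres) ;;
    *)
      echo "unsupported dbType: $db_type" >&2
      return 1
      ;;
  esac
  echo "/usr/hdp/current/hive-metastore/bin/schematool -upgradeSchema -dbType $db_type"
}

# Example: review the printed command before running it.
# schematool_upgrade_cmd mysql
```

Printing the command instead of running it keeps the helper safe to call on any host; run the printed command yourself once it looks right.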

    Note

    If you are using Postgres 8 or Postgres 9, reset the Hive Metastore database owner to <HIVE_USER>:

    sudo -u <POSTGRES_USER> psql
    ALTER DATABASE <HIVE-METASTORE-DB-NAME> OWNER TO <HIVE_USER>;

    Note

    If you are using Oracle 11, you may see the following error message:

    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.optimize.mapjoin.mapreduce does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.heapsize does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
    14/11/17 14:11:38 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.noconditionaltask does not exist
    Metastore connection URL: jdbc:oracle:thin:@//ip-172-31-42-1.ec2.internal:1521/XE
    Metastore Connection Driver : oracle.jdbc.driver.OracleDriver
    Metastore connection User: hiveuser
    Starting upgrade metastore schema from version 0.13.0 to 0.14.0
    Upgrade script upgrade-0.13.0-to-0.14.0.oracle.sql
    Error: ORA-00955: name is already used by an existing object (state=42000,code=955)
    Warning in pre-upgrade script pre-0-upgrade-0.13.0-to-0.14.0.oracle.sql: Schema script failed, errorcode 2
    Completed upgrade-0.13.0-to-0.14.0.oracle.sql
    schemaTool completed

    You can safely ignore this error. It comes from the pre-upgrade script, and the final line, schemaTool completed, confirms that the schema upgrade succeeded.
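
When the upgrade output is captured to a file, the completion line can be checked mechanically instead of by eye. This is a sketch under assumptions: the function name and the log path are hypothetical, and the check relies only on the schemaTool completed line that appears in the sample output above.

```shell
# Hypothetical helper: succeed only if the captured schematool output
# contains the completion line shown in the sample log.
schematool_completed() {
  log_file=$1
  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 1
  fi
  grep -q '^ *schemaTool completed' "$log_file"
}

# Example (the log path is an assumption):
# /usr/hdp/current/hive-metastore/bin/schematool -upgradeSchema -dbType oracle > /tmp/schematool-upgrade.log 2>&1
# schematool_completed /tmp/schematool-upgrade.log && echo "schema upgrade finished"
```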

    Note

    Copy only the necessary configuration files. Do not copy the env.sh files, such as hadoop-env.sh and hive-env.sh. Additionally, all env.sh files must be properly configured.

  3. Download and extract the HDP companion files.

    Copy the hive-site.xml file from the configuration_files/hive directory of the extracted companion files to the /etc/hive/conf directory on your Hive host machine. This version of the hive-site.xml file contains new properties for HDP-2.2 features.

  4. Edit the hive-site.xml file and modify the properties based on your environment. Search the file for TODO to find the properties to replace.

    1. Edit the connection properties for your Hive metastore database in hive-site.xml:

      <property>
       <name>javax.jdo.option.ConnectionURL</name>
       <value>jdbc:mysql://TODO-HIVE-METASTORE-DB-SERVER:TODO-HIVE-METASTORE-DB-PORT/TODO-HIVE-METASTORE-DB-NAME?createDatabaseIfNotExist=true</value>
       <description>Enter your Hive Metastore Connection URL,
         for example if MySQL: jdbc:mysql://localhost:3306/mysql?createDatabaseIfNotExist=true</description>
      </property>
       
      <property>
       <name>javax.jdo.option.ConnectionUserName</name>
       <value>TODO-HIVE-METASTORE-DB-USER-NAME</value>
       <description>Enter your Hive Metastore database user name.</description>
      </property>
       
      <property> 
       <name>javax.jdo.option.ConnectionPassword</name> 
       <value>TODO-HIVE-METASTORE-DB-PASSWORD</value> 
       <description>Enter your Hive Metastore database password.</description>
      </property>
       
      <property>
       <name>javax.jdo.option.ConnectionDriverName</name>
       <value>TODO-HIVE-METASTORE-DB-CONNECTION-DRIVER-NAME</value>
       <description>Enter your Hive Metastore Connection Driver Name, 
         for example if MySQL: com.mysql.jdbc.Driver</description>
      </property>
    2. Edit the following properties in the hive-site.xml file:

      <property>
       <name>fs.file.impl.disable.cache</name>
       <value>false</value>
       <description>Set to false or remove fs.file.impl.disable.cache</description> 
      </property>
       
      <property>
       <name>fs.hdfs.impl.disable.cache</name>
       <value>false</value>
       <description>Set to false or remove fs.hdfs.impl.disable.cache</description>
      </property>
    3. Optional: If you want Hive Authorization, set the following Hive authorization parameters in the hive-site.xml file:

      <property>
       <name>hive.security.authorization.enabled</name>
       <value>true</value>
      </property>
       
      <property>
       <name>hive.security.authorization.manager</name>
       <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
      </property>
       
      <property>
       <name>hive.security.metastore.authorization.manager</name>
       <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
      </property>
       
      <property>
       <name>hive.security.authenticator.manager</name>
       <value>org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator</value>
      </property>
    4. For a remote Hive metastore database, set the IP address (or fully-qualified domain name) and port of the metastore host using the following hive-site.xml property value.

      (To enable HiveServer2, leave this property value empty.)

      <property> 
       <name>hive.metastore.uris</name> 
       <value>thrift://$metastore.server.full.hostname:9083</value> 
       <description>URI for client to contact metastore server. 
         To enable HiveServer2, leave the property value empty. 
         </description>
      </property>

      You can further fine-tune your configuration settings based on node hardware specifications, using the HDP utility script.
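
Because the companion hive-site.xml marks every value to replace with TODO, a quick search shows whether any placeholder was missed in this step. A minimal sketch, assuming the edited file is readable by the current user; the function name is hypothetical and the path in the example is an assumption.

```shell
# Hypothetical helper: list lines that still contain a TODO placeholder.
# Returns 0 when the file is clean, 1 when placeholders remain
# or the file cannot be read.
find_todo_placeholders() {
  conf_file=$1
  if [ ! -r "$conf_file" ]; then
    echo "cannot read $conf_file" >&2
    return 1
  fi
  # grep -n prints each matching line with its line number.
  if grep -n 'TODO' "$conf_file"; then
    echo "placeholders remain in $conf_file" >&2
    return 1
  fi
  return 0
}

# Example (the path is an assumption):
# find_todo_placeholders /etc/hive/conf/hive-site.xml
```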

  5. Start the Hive Metastore.

    On the Hive Metastore host machine, run the following command:

    su - hive

    nohup /usr/hdp/current/hive-metastore/bin/hive --service metastore >/var/log/hive/hive.out 2>/var/log/hive/hive.log &

  6. Start HiveServer2.

    On the HiveServer2 host machine, run the following command:

    su - hive

    nohup /usr/hdp/current/hive-server2/bin/hiveserver2 -hiveconf hive.metastore.uris=" " -hiveconf hive.log.file=hiveserver2.log >/var/log/hive/hiveserver2.out 2> /var/log/hive/hiveserver2err.log &

    In the commands above, hive is the Hive service user ($HIVE_USER). Substitute your own service user name if it differs.
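
Because both services are launched in the background with nohup, a start failure is easy to miss. The sketch below, with a hypothetical function name, checks whether a background process is still alive. It assumes you capture the process ID from $! immediately after the launch command, in the same shell session; the 10-second wait is an arbitrary choice.

```shell
# Hypothetical helper: succeed if the process with the given PID is still alive.
# kill -0 sends no signal; it only tests whether the PID can be signaled.
is_running() {
  kill -0 "$1" 2>/dev/null
}

# Example, in the same shell session as the nohup command:
# nohup /usr/hdp/current/hive-metastore/bin/hive --service metastore >/var/log/hive/hive.out 2>/var/log/hive/hive.log &
# metastore_pid=$!
# sleep 10
# is_running "$metastore_pid" || echo "metastore exited; check /var/log/hive/hive.log" >&2
```

A process that is still alive after a short wait has not necessarily finished starting, so the log files named in the launch commands remain the place to confirm a clean start.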

