Migrating data
Also available as:
PDF

Import RDBMS data into Hive

You can test the Apache Sqoop import command and then execute the command to import relational database tables into Hive.

You enter the Sqoop import command on the command line of your Hive cluster to import data from a data source to Hive. You can test the import statement before actually executing it.
  • Apache Sqoop is installed and configured.
  • A Hive Metastore is associated with your HDFS cluster.
  1. Optionally, test the import command before execution using the eval option.
    sqoop eval --connect jdbc:mysql://db.foo.com/bar \
    --query "SELECT * FROM employees LIMIT 10"
    The output of the select statement appears listing 10 rows of data from the RDBMS employees table.
  2. Execute a Sqoop import command that specifies the Sqoop connection to the RDBMS, the data you want to import, and the destination Hive table name.
    This command imports the MySQL EMPLOYEES table to a new Hive table named in the default HDFS location /user/hive/warehouse.
    sqoop import --connect jdbc:mysql://db.foo.com:3306/corp \
    --table EMPLOYEES \
    --hive-import \
    --create-hive-table \
    --hive-table mydb.newtable