Create partitions dynamically

You can configure Hive to create partitions dynamically and then run a query that creates the related directories on the file system, such as HDFS or S3. Hive then separates the data into the directories.

This example assumes you have the following CSV file named employees.csv to use as the data source:

1,jane doe,engineer,service
2,john smith,sales rep,sales
3,naoko murai,service rep,service
4,somporn thong,ceo,sales
5,xi singh,cfo,finance

Put the CSV file on a file system, for example in HDFS at/user/hive/dataload/employee, and change permissions.

hdfs dfs -mkdir /user/hive/dataload/employee
hdfs dfs -chmod 777 /user/hive/dataload/employee
hdfs dfs -put employees.csv /user/hive/dataload/employee/employees.csv
hdfs dfs -chmod 777 /user/hive/dataload/employee/employees.csv

Launch Beeline and in the Hive shell, create an unpartitioned table that holds all the data.

CREATE EXTERNAL TABLE employees (eid int, name string, position string, dept string)
  ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  STORED AS TEXTFILE
  LOCATION '/user/hive/dataload/employee';

Check that the data loaded into the employees table.

SELECT * FROM employees;

The output, formatted to fit this publication, appears:


+------+---------------+-------------+-------+---------+
| eid  |    name       | position    | dept  |         |
+------+---------------+-------------+-------+---------|
| 1    | jane doe      | engineer    | service         |
| 2    | john smith    | sales rep   | sales           |
| 3    | naoko murai   | service rep | service         |
| 4    | somporn thong | ceo         | sales           |
| 5    | xi singh      | cfo         | finance         |
+------+---------------+-------------+-----------------+

Create a partition table.

CREATE EXTERNAL TABLE EMP_PART (eid int, name string, position string) 
  PARTITIONED BY (dept string);

Set the dynamic partition mode to create partitioned directories of data dynamically when data is inserted.
```
SET hive.exec.dynamic.partition.mode=nonstrict;
```
Insert data from the unpartitioned table (all the data) into the partitioned table , dynamically creating the partitions.
```
INSERT INTO TABLE EMP_PART PARTITION (DEPT)
  SELECT eid,name,position,dept FROM employees;
```
Partitions are created dynamically.

Check that the partitions were created.

SHOW PARTITIONS emp_part;

+----------------+
|   partition    |
+----------------+
| dept=finance   |
| dept=sales     |
| dept=service   |
+----------------+

Check the corresponding partition directories on HDFS.

hdfs dfs -ls /warehouse/tablespace/external/hive/emp_part

drwxrwxrwx+  - hive hadoop          0 2019-06-13 22:30 /warehouse/tablespace/external/hive/emp_part/dept=finance 
drwxrwxrwx+  - hive hadoop          0 2019-06-13 22:30 /warehouse/tablespace/external/hive/emp_part/dept=sales
drwxrwxrwx+  - hive hadoop          0 2019-06-13 22:30 /warehouse/tablespace/external/hive/emp_part/dept=service

You cannot remove the directories in the warehouse owned by hive.