Using constraints
You can use DEFAULT, PRIMARY KEY, FOREIGN KEY, and NOT NULL constraints in Hive ACID table definitions to improve the performance, accuracy, and reliability of data.
The Hive engine and BI tools can simplify queries if data is predictable and easily located.
Hive enforces constraints as follows:
- DEFAULT
- Ensures a value exists, which is useful in EDW offload cases.
- PRIMARY KEY
- Identifies each row in a table using a unique identifier.
- FOREIGN KEY
- Identifies a row in another table using a unique identifier.
- NOT NULL
- Checks that a column value is not set to NULL.
CREATE TABLE Persons (
ID INT NOT NULL,
Name STRING NOT NULL,
Age INT,
Creator STRING DEFAULT CURRENT_USER(),
CreateDate DATE DEFAULT CURRENT_DATE(),
PRIMARY KEY (ID) DISABLE NOVALIDATE);
CREATE TABLE BusinessUnit (
ID INT NOT NULL,
Head INT NOT NULL,
Creator STRING DEFAULT CURRENT_USER(),
CreateDate DATE DEFAULT CURRENT_DATE(),
PRIMARY KEY (ID) DISABLE NOVALIDATE,
CONSTRAINT fk FOREIGN KEY (Head) REFERENCES Persons(ID) DISABLE NOVALIDATE
);