Using Apache HiveQL
Also available as:
PDF

Use a materialized view in a subquery

You can create a materialized view for optimizing a subquery.

In this task, you create a materialized view and use it in a subquery to return the number of destination-origin pairs. Suppose the data resides in a table name flights_hdfs and has the following data:
c_id dest origin
1 Chicago Hyderabad
2 London Moscow
...
  1. Create a table schema definition named flights_hdfs for destination and origin data.
    CREATE TABLE flights_hdfs(
      c_id INT,
      dest VARCHAR(256),
      origin VARCHAR(256))
      STORED AS ORC;
  2. Create a materialized view that counts destinations and origins.
    CREATE MATERIALIZED VIEW mv1
    AS
      SELECT dest, origin, count(*)
      FROM flights_hdfs
      GROUP BY dest, origin;
  3. Use the materialized view in a subquery to return the number of destination-origin pairs.
    SELECT count(*)/2
    FROM(
      SELECT dest, origin, count(*)
      FROM flights_hdfs
      GROUP BY dest, origin
    ) AS t;