1.3. Hive SQL on Tez - DAG, Vertex and Task

In Hive, a user query written in SQL is compiled and then converted for execution into a Tez execution graph, or more precisely a Directed Acyclic Graph (DAG). A DAG is a collection of Vertices, where each Vertex executes a part, or fragment, of the user query. The directed connections between Vertices determine the order in which they are executed. For example, the Vertex that reads a table has to run before a filter can be applied to the rows of that table.
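The ordering rule described above can be illustrated with a small sketch. This is not the Tez API: the vertex names (`scan_table`, `filter_rows`, `aggregate`) and the `execution_order` helper are hypothetical, and the graph is a simplified stand-in for what Hive would actually generate. It only shows how directed connections between Vertices imply an execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical DAG: each key is a vertex, and each value is the set of
# vertices that must run before it (its upstream vertices).
dag = {
    "scan_table": set(),              # reads the user table
    "filter_rows": {"scan_table"},    # filters the rows that were read
    "aggregate": {"filter_rows"},     # aggregates the filtered rows
}

def execution_order(graph):
    """Return the vertices in an order that respects every connection."""
    return list(TopologicalSorter(graph).static_order())

order = execution_order(dag)
print(order)
```

Because the table scan has no upstream vertex, it comes first, and the filter can only appear after it. A graph containing a cycle would raise `graphlib.CycleError`, which is why the execution graph has to be acyclic.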

Let’s say that a Vertex reads a user table. This table can be very large and distributed across multiple machines and multiple racks, so the read is carried out by running many tasks in parallel. Here is a simplified example using a sample query that shows the execution of a SQL query in Hive.

Executing a SQL query in Hive
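The relationship between one Vertex and its many parallel tasks can also be sketched in code. The following is a simplified, single-machine illustration under stated assumptions: `split_rows`, `run_task`, and `run_vertex` are hypothetical names, a Python list stands in for a distributed table, and a thread pool stands in for tasks running across a cluster. It is not how Tez schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor

def split_rows(rows, num_tasks):
    """Divide rows into at most num_tasks contiguous chunks."""
    size = max(1, -(-len(rows) // num_tasks))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def run_task(chunk):
    """One task: count the rows in its chunk that pass a sample filter."""
    return sum(1 for value in chunk if value % 2 == 0)

def run_vertex(rows, num_tasks=4):
    """Run the same task logic over every chunk in parallel."""
    chunks = split_rows(rows, num_tasks)
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        return list(pool.map(run_task, chunks))

table = list(range(1000))        # stand-in for a large user table
partials = run_vertex(table)     # one partial result per task
total = sum(partials)            # a downstream step combines the partials
print(partials, total)
```

Every task runs the same logic on a different portion of the data, which is what allows a single Vertex to scale with the size of the table.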

The Tez View tool lets you more easily understand and debug any submitted Tez job. Examples of Tez jobs include a Hive query or a Pig script executed using the Tez execution engine. Specifically, Tez helps you do the following tasks:

