Workflow Management

Understanding the Design Component

You can create and run workflows, coordinators, and bundles from Workflow Manager.

You create a workflow in Workflow Manager using a graphing flow tool. The type of graph you create is known as a directed acyclic graph (DAG). Every transition in this type of graph has a direction, and the graph can never be cyclic, so the flow can never loop back to a node that has already run. A workflow must begin with a start node, followed by one or more action nodes, and end with an end node. Other control nodes or action nodes can be included.

An action node represents an Oozie action in the workflow, such as a Hive, Sqoop, Spark, Pig, or Java action, among others. Control nodes direct the execution flow in the workflow. Control nodes include the start and end nodes, as well as the fork and decision nodes. In Workflow Manager, the start and end nodes are preconfigured and cannot be modified.
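As a rough illustration of how control nodes direct the flow, the following fragment sketches a decision node and a fork/join pair as they might appear in the XML that Workflow Manager generates. It is a sketch only: the node names (check-input, fork-node, hive-node, sqoop-node, join-node) and the inputDir property are hypothetical, and the action definitions themselves are omitted.

```xml
<!-- Fragment only: these elements belong inside a <workflow-app> element. -->
<!-- All node names and the inputDir property are hypothetical. -->

<!-- Decision node: takes the first case whose expression evaluates to true. -->
<decision name="check-input">
    <switch>
        <case to="fork-node">${fs:exists(inputDir)}</case>
        <default to="end"/>
    </switch>
</decision>

<!-- Fork node: starts both paths so the two actions run concurrently. -->
<fork name="fork-node">
    <path start="hive-node"/>
    <path start="sqoop-node"/>
</fork>

<!-- The hive-node and sqoop-node actions (not shown) would each
     transition to join-node when they succeed. -->

<!-- Join node: waits for every forked path, then continues. -->
<join name="join-node" to="end"/>
```

A fork is normally paired with a join so that the workflow resumes as a single path after the concurrent actions finish.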

A workflow succeeds when it reaches the end node. If a workflow fails, it transitions to a kill node and stops. The workflow reports the error message that you specify in the message element in the workflow definition.
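The structure described above can be sketched as a minimal workflow definition with one start node, one action node, one kill node, and one end node. This is an illustrative example, not output captured from Workflow Manager: the workflow name, node names, script name, and the jobTracker and nameNode properties are assumptions.

```xml
<!-- Minimal sketch of a workflow definition. All names are hypothetical. -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="sample-wf">
    <!-- Start node: points to the first node to run. -->
    <start to="pig-node"/>

    <!-- Action node: a Pig action with a success and a failure transition. -->
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>script.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- Kill node: stops the workflow and reports the message element. -->
    <kill name="fail">
        <message>Pig action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <!-- End node: reaching it means the workflow succeeded. -->
    <end name="end"/>
</workflow-app>
```

In this sketch, a failure in pig-node follows the error transition to the kill node, and the text in the message element is what the workflow reports.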

Using a coordinator, you can assign a schedule and frequency to a workflow. You can also identify specific events that trigger start and end actions in the workflow. If you want to manage multiple recurring workflow jobs as a group, you must add each workflow to a coordinator, then add the coordinators to a bundle.
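To make the relationship between the three job types concrete, the following sketch shows a coordinator that runs a workflow once a day, and a bundle that groups that coordinator. The application names, dates, and HDFS paths are hypothetical placeholders, and a real bundle would usually list several coordinators.

```xml
<!-- Coordinator sketch: schedules one workflow. Names, dates, and paths are hypothetical. -->
<coordinator-app name="daily-coord" frequency="${coord:days(1)}"
                 start="2024-01-01T00:00Z" end="2024-12-31T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <workflow>
            <!-- HDFS directory that contains the workflow definition. -->
            <app-path>${nameNode}/user/${user.name}/apps/sample-wf</app-path>
        </workflow>
    </action>
</coordinator-app>
```

```xml
<!-- Bundle sketch: groups coordinators so they can be managed together. -->
<bundle-app name="sample-bundle" xmlns="uri:oozie:bundle:0.2">
    <coordinator name="daily-coord">
        <!-- HDFS directory that contains the coordinator definition. -->
        <app-path>${nameNode}/user/${user.name}/apps/daily-coord</app-path>
    </coordinator>
</bundle-app>
```

The bundle references coordinators, and each coordinator references a workflow, which is why a workflow must be added to a coordinator before it can be included in a bundle.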

When you save, validate, or submit a workflow, coordinator, or bundle, each is saved in its own XML file, also called an application file.

Before a workflow, coordinator, or bundle job can run, the job XML file and all other files needed by the components of the workflow must be deployed to HDFS. These files might include JAR files for MapReduce jobs, shell scripts for streaming MapReduce jobs, native libraries, Pig scripts, and other resource files. The files are often packaged together as a ZIP file and referred to as a workflow application.

For a practical example of using Workflow Manager to create and monitor a simple Extract-Transform-Load workflow, see "Sample ETL Use Case".

More Information

See "Workflow Manager Design Component" in Workflow Manager Basics.