Chapter 4. Using Data Integration Services Powered by Talend

Talend Open Studio for Big Data is a powerful and versatile open-source data integration solution. It lets an enterprise keep working with its existing data and systems while using Hadoop to power large-scale data analysis.

Talend Open Studio (TOS) is distributed as an add-on for the Hortonworks Data Platform (HDP). With TOS and the HDP components it builds on, users can:

  • Read from and write to Hadoop as a data source or sink.

  • Import raw data into Hadoop (HBase and HDFS), and create and manage schemas, using HCatalog metadata services.

  • Analyze these data sets using Pig and Hive.

  • Schedule the ETL jobs to run on a recurring basis on a Hadoop cluster using Oozie.

For more information on Talend Open Studio, see Talend Open Studio v5.3 Documentation.