Chapter 1. Using Data Integration Services Powered by Talend

Talend Open Studio for Big Data is a powerful and versatile open source data integration solution. It enables an enterprise to work with its existing data and systems, and to use Hadoop to power large-scale data analysis across the enterprise.

Talend Open Studio (TOS) is distributed as an add-on for Hortonworks Data Platform (HDP). TOS works with HDP components to let users do the following:

  • Read from and write to Hadoop as a data source or sink.

  • Import raw data into Hadoop (HBase and HDFS), and create and manage schemas, using HCatalog metadata services.

  • Analyze these data sets using Pig and Hive.

  • Schedule these ETL jobs on a recurring basis on a Hadoop cluster using Oozie.

This document includes the following sections:

For more information on Talend Open Studio, see Talend Open Studio v5.3 Documentation.