Using Apache Storm to Move Data

HDFS Spout Example

The following example creates an HDFS spout that reads text files from the /data/in directory of the HDFS instance at hdfs://localhost:54310.

// Instantiate spout to read text files
HdfsSpout textReaderSpout = new HdfsSpout().setReaderType("text")
                                           .withOutputFields(TextFileReader.defaultFields)
                                           .setHdfsUri("hdfs://localhost:54310")  // required
                                           .setSourceDir("/data/in")              // required
                                           .setArchiveDir("/data/done")           // required
                                           .setBadFilesDir("/data/badfiles");     // required

// If using Kerberos
HashMap<String, Object> hdfsSettings = new HashMap<>();
hdfsSettings.put("hdfs.keytab.file", "/path/to/keytab");
hdfsSettings.put("hdfs.kerberos.principal", "user@EXAMPLE.com");

textReaderSpout.setHdfsClientSettings(hdfsSettings);

// Create topology
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("hdfsspout", textReaderSpout, SPOUT_NUM);

// Set up bolts and wire up topology
   ...

// Submit topology with config
Config conf = new Config();
StormSubmitter.submitTopologyWithProgressBar("topologyName", conf, builder.createTopology());
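
The bolt section elided above can contain any application logic. As a minimal, hypothetical sketch (the class name, bolt id, and parallelism hint below are illustrative and not part of the storm-hdfs API), a terminal bolt that prints each record could look like the following. The tuple layout is whatever TextFileReader.defaultFields declares, so the first field is read by position. If ackers are enabled for the topology, downstream bolts must ack every tuple; BaseBasicBolt does this automatically after execute() returns.

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

// Illustrative terminal bolt: prints each line read from HDFS.
public class LineLoggerBolt extends BaseBasicBolt {

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        // Read the spout's first output field by position and print it.
        // BaseBasicBolt acks the tuple automatically after execute() returns.
        System.out.println(tuple.getString(0));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: nothing is emitted downstream.
    }
}

// Wire the bolt to the spout; the id "lineLogger" and parallelism of 1 are arbitrary.
builder.setBolt("lineLogger", new LineLoggerBolt(), 1).shuffleGrouping("hdfsspout");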

A sample topology HdfsSpoutTopology is provided in the storm-starter module.
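
For quick functional testing before submitting to a real cluster, the same topology can also be run in local mode. This is a sketch that assumes the Storm 1.x package layout (older releases use backtype.storm instead of org.apache.storm) and an arbitrary topology name:

import org.apache.storm.LocalCluster;

// Run the topology in-process for roughly one minute, then shut down.
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("hdfsSpoutTest", conf, builder.createTopology());
Thread.sleep(60 * 1000);
cluster.shutdown();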