4. Packaging Storm Topologies

Storm developers should verify that the following conditions are met when packaging their topology into a .jar file:

  • Use the maven-shade-plugin, rather than the maven-assembly-plugin, to package your Apache Storm topologies. The maven-shade-plugin can merge JAR manifest entries and META-INF/services files, which the Hadoop client uses to resolve URL schemes such as hdfs://.

  • Include a dependency for the Hadoop version used in the Hadoop cluster.

  • Include both the hdfs-site.xml and core-site.xml configuration files in the .jar file. This is the easiest way to meet the requirement that these two files are in the CLASSPATH of your topology.
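One way to satisfy the last requirement is to let Maven copy the two files into the root of the .jar as resources. The fragment below is a minimal sketch, not a required layout: the directory name conf/hadoop is a hypothetical location for copies of the cluster's configuration files. Because declaring <resources> replaces Maven's default resource directory, src/main/resources is listed again explicitly.

```xml
<build>
    <resources>
        <!-- Maven's default resource directory; re-declared because an
             explicit <resources> element replaces the default -->
        <resource>
            <directory>src/main/resources</directory>
        </resource>
        <!-- Hypothetical directory holding copies of the cluster's files -->
        <resource>
            <directory>conf/hadoop</directory>
            <includes>
                <include>core-site.xml</include>
                <include>hdfs-site.xml</include>
            </includes>
        </resource>
    </resources>
</build>
```

If the two files are placed directly in src/main/resources, no extra configuration should be needed, because Maven packages that directory by default.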

Maven Shade Plugin

Add the following plugin configuration to the <build><plugins> section of your pom.xml file to package your topology:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>1.4</version>
    <configuration>
        <createDependencyReducedPom>true</createDependencyReducedPom>
    </configuration>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
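                    <!-- Merges META-INF/services files from all dependency JARs -->
                    <transformer
                            implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    <!-- Writes the Main-Class manifest entry; mainClass is left empty here -->
                    <transformer
                            implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass></mainClass>
                    </transformer>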
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>

Hadoop Dependency

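The following example declares a dependency on the Hadoop client. Set the version to match the Hadoop cluster. The exclusion keeps the slf4j-log4j12 logging binding out of the topology .jar: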

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.2.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
    </exclusions>
</dependency>
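To make it easier to keep the client version aligned with the cluster, the version can be held in a Maven property. This is only a sketch: the property name hadoop.version is an assumption chosen for illustration, and 2.2.0 is simply the version used in the example above.

```xml
<properties>
    <!-- Hypothetical property name; set the value to the cluster's Hadoop version -->
    <hadoop.version>2.2.0</hadoop.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <!-- Resolved from the property above -->
        <version>${hadoop.version}</version>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
```

With this layout, the value can be overridden at build time, for example with mvn package -Dhadoop.version=<cluster version>, without editing the POM.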

Troubleshooting

The following table describes common packaging errors.

Table 2.6. Topology Packaging Errors

Error: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero)
Description: Hadoop client version incompatibility.

Error: java.lang.RuntimeException: Error preparing HdfsBolt: No FileSystem for scheme: hdfs
Description: The .jar manifest files are not properly merged in the topology .jar.
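If the "No FileSystem for scheme: hdfs" error persists after the build is corrected, a commonly used workaround is to name the file system implementation explicitly in the core-site.xml file that is packaged with the topology. This is a sketch of that workaround, not part of the packaging procedure described here. It assumes a Hadoop 2.x client, which consults the fs.<scheme>.impl property before the merged service entries, and it assumes the hadoop-hdfs classes are present in the topology .jar.

```xml
<!-- Add inside the <configuration> element of the packaged core-site.xml -->
<property>
    <!-- Maps the hdfs:// scheme directly to its implementation class -->
    <name>fs.hdfs.impl</name>
    <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
</property>
```

Merging the service entries with the maven-shade-plugin remains the preferred fix, because it also covers any other file system schemes the topology uses.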