1. Create a Rack Topology Script

Hadoop uses topology scripts to determine the rack location of each node. It then uses this information to place replicas of block data on separate racks, so that the loss of a single rack does not make a block unavailable.

  1. Create a topology script and data file.

    Sample Topology Script

    File name: rack-topology.sh

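    #!/bin/bash
    # Print the rack path for each host name or IP address passed as an argument.
    HADOOP_CONF=/etc/hadoop/conf

    while [ $# -gt 0 ] ; do
      nodeArg=$1
      result=""
      # Scan topology.data; each line holds a host followed by its rack path.
      while read -r line ; do
        ar=( $line )
        if [ "${ar[0]}" = "$nodeArg" ] ; then
          result="${ar[1]}"
        fi
      done < "${HADOOP_CONF}/topology.data"
      shift
      # Fall back to the default rack for hosts that are not listed.
      if [ -z "$result" ] ; then
        echo -n "/default/rack "
      else
        echo -n "$result "
      fi
    done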

    Sample Topology Data File

    File name: topology.data

    hadoopdata1.ec.com     /dc1/rack1
    hadoopdata1            /dc1/rack1
    10.1.1.1               /dc1/rack2

  2. Copy both of these files to /etc/hadoop/conf.

  3. Run the rack-topology.sh script to ensure that it returns the correct rack information for each host.
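
The check in step 3 can be rehearsed before anything is copied to /etc/hadoop/conf. The following is a minimal sketch, not part of Hadoop: it writes the sample data to a temporary directory and applies the same lookup rule as the sample script (the last matching line wins, and unlisted hosts fall back to /default/rack). The function name lookup_rack and the host unknown-host are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
# Hypothetical smoke test: mirrors the lookup rule of rack-topology.sh against
# a temporary copy of the sample data, so /etc/hadoop/conf is not touched.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

cat > "$tmpdir/topology.data" <<'EOF'
hadoopdata1.ec.com     /dc1/rack1
hadoopdata1            /dc1/rack1
10.1.1.1               /dc1/rack2
EOF

# lookup_rack is an illustrative helper name, not a Hadoop command.
# It prints the rack of the given host, or /default/rack if the host is not listed.
lookup_rack() {
  local node=$1 host rack result=""
  while read -r host rack _ ; do
    if [ "$host" = "$node" ] ; then
      result=$rack
    fi
  done < "$tmpdir/topology.data"
  echo "${result:-/default/rack}"
}

# Two hosts from the sample data file and one that is deliberately missing.
for node in hadoopdata1.ec.com 10.1.1.1 unknown-host ; do
  printf '%s -> %s\n' "$node" "$(lookup_rack "$node")"
done
```

On a cluster node, the equivalent check is to run the installed script with one or more host names or IP addresses as arguments, for example bash /etc/hadoop/conf/rack-topology.sh hadoopdata1 10.1.1.1. With the sample data file shown above, the script should print /dc1/rack1 /dc1/rack2 on a single line.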

