4. Create Directories

Create the directories and configure their ownership and permissions on the appropriate hosts, as described below.

Note

If any of these directories already exist, we recommend deleting and recreating them.

The scripts.zip file that you downloaded in Download Companion Files includes two scripts, usersAndGroups.sh and directories.sh, that set the environment parameters used in the commands below. We strongly suggest that you edit these scripts to fit your environment and then execute them.
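
Because every command in this section expands one of these parameters, it may be worth confirming that they are all set before running anything. The following is a minimal sketch, not part of the companion files: check_dir_vars is a hypothetical helper name, and the list of variables is simply the set that appears in the commands in this section.

```shell
# Hypothetical helper: report every variable used in this section that is
# unset or empty. Returns 0 when all are set, 1 otherwise.
check_dir_vars() {
  missing=0
  for name in DFS_NAME_DIR FS_CHECKPOINT_DIR DFS_DATA_DIR MAPREDUCE_LOCAL_DIR \
              HDFS_LOG_DIR MAPRED_LOG_DIR HDFS_PID_DIR MAPRED_PID_DIR \
              HDFS_USER MAPRED_USER HADOOP_GROUP; do
    # Indirect lookup that works in any POSIX shell.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_dir_vars in the same shell session in which you executed the two scripts prints one line per missing variable and returns a non-zero status if any are missing.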

4.1. Create the NameNode Directories

On the node that hosts the NameNode service, execute the following commands:

mkdir -p $DFS_NAME_DIR
chown -R $HDFS_USER:$HADOOP_GROUP $DFS_NAME_DIR
chmod -R 755 $DFS_NAME_DIR
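
To confirm the result, you can print the owner, group, and octal mode of each directory. This is a sketch under two assumptions: show_perms is a hypothetical helper name, and the host provides GNU coreutils stat (the -c option is not available in every stat implementation).

```shell
# Hypothetical helper: print "owner:group mode path" for each path given.
# Assumes GNU coreutils stat.
show_perms() {
  for dir in "$@"; do
    stat -c '%U:%G %a %n' "$dir"
  done
}

# The variable is left unquoted so that a value holding several
# space-separated paths is checked one path at a time.
show_perms $DFS_NAME_DIR
```

The same helper can be pointed at any of the other directory variables in this section.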

4.2. Create the SecondaryNameNode Directories

On all the nodes that can potentially run the SecondaryNameNode service, execute the following commands:

mkdir -p $FS_CHECKPOINT_DIR
chown -R $HDFS_USER:$HADOOP_GROUP $FS_CHECKPOINT_DIR
chmod -R 755 $FS_CHECKPOINT_DIR

4.3. Create the DataNode and MapReduce Local Directories

On all DataNodes, execute the following commands:

mkdir -p $DFS_DATA_DIR
chown -R $HDFS_USER:$HADOOP_GROUP $DFS_DATA_DIR
chmod -R 750 $DFS_DATA_DIR

On the JobTracker and all DataNodes, execute the following commands:

mkdir -p $MAPREDUCE_LOCAL_DIR
chown -R $MAPRED_USER:$HADOOP_GROUP $MAPREDUCE_LOCAL_DIR
chmod -R 755 $MAPREDUCE_LOCAL_DIR

4.4. Create the Log and PID Directories

On all nodes, execute the following commands:

mkdir -p $HDFS_LOG_DIR
chown -R $HDFS_USER:$HADOOP_GROUP $HDFS_LOG_DIR
chmod -R 755 $HDFS_LOG_DIR
mkdir -p $MAPRED_LOG_DIR
chown -R $MAPRED_USER:$HADOOP_GROUP $MAPRED_LOG_DIR
chmod -R 755 $MAPRED_LOG_DIR
mkdir -p $HDFS_PID_DIR
chown -R $HDFS_USER:$HADOOP_GROUP $HDFS_PID_DIR
chmod -R 755 $HDFS_PID_DIR
mkdir -p $MAPRED_PID_DIR
chown -R $MAPRED_USER:$HADOOP_GROUP $MAPRED_PID_DIR
chmod -R 755 $MAPRED_PID_DIR
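
Every directory in this section follows the same three-step pattern: create, set ownership, set mode. If you prefer to script it, the pattern could be wrapped in a small function. This is only a sketch: make_dir is a hypothetical name, not something shipped in scripts.zip, and it must run as a user who is permitted to change ownership (typically root).

```shell
# Hypothetical helper: create each directory, then set owner, group, and mode.
# Usage: make_dir OWNER GROUP MODE DIR...
make_dir() {
  owner=$1
  group=$2
  mode=$3
  shift 3
  for dir in "$@"; do
    # Stop at the first failing step so later steps do not run on a bad state.
    mkdir -p "$dir" &&
      chown -R "$owner:$group" "$dir" &&
      chmod -R "$mode" "$dir" || return 1
  done
}

# Example calls, equivalent to the first two groups of commands above:
# make_dir "$HDFS_USER" "$HADOOP_GROUP" 755 $HDFS_LOG_DIR
# make_dir "$MAPRED_USER" "$HADOOP_GROUP" 755 $MAPRED_LOG_DIR
```

The directory variables are passed unquoted so that a value containing several space-separated paths is handled one path at a time.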
