2. Decommission DataNodes or TaskTrackers

Nodes normally run both a DataNode and a TaskTracker, and both are typically commissioned or decommissioned together.

With the replication level set to three, HDFS is resilient to individual DataNode failures. However, there is a high chance of data loss when you terminate DataNodes without decommissioning them first. Decommission nodes on a schedule that gives HDFS time to re-replicate the blocks stored on them to the remaining DataNodes.
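
To make the risk concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of Hadoop. It assumes that each block's replicas sit on distinct DataNodes chosen uniformly at random and independently of other blocks, which ignores HDFS rack awareness, so treat the numbers as an illustration only. The function names are hypothetical.

```python
from math import comb


def block_loss_probability(total_nodes, removed_nodes, replication=3):
    """Chance that every replica of one block is on the removed nodes.

    Assumption: replicas are placed on distinct nodes chosen uniformly
    at random (real HDFS placement is rack-aware, so this is a simplification).
    """
    if total_nodes < replication or removed_nodes > total_nodes:
        raise ValueError("need replication <= total_nodes and removed_nodes <= total_nodes")
    if removed_nodes < replication:
        # Fewer nodes removed than replicas: at least one replica survives.
        return 0.0
    return comb(removed_nodes, replication) / comb(total_nodes, replication)


def any_block_lost_probability(total_nodes, removed_nodes, num_blocks, replication=3):
    """Chance that at least one of num_blocks blocks loses all its replicas.

    Assumption: blocks are placed independently of each other.
    """
    p = block_loss_probability(total_nodes, removed_nodes, replication)
    return 1.0 - (1.0 - p) ** num_blocks


if __name__ == "__main__":
    # Example: terminate 10 of 100 DataNodes at once, with one million blocks.
    print(any_block_lost_probability(100, 10, 1_000_000))
```

Under these assumptions, removing one or two nodes at a time never loses a block outright, while removing many nodes at once on a cluster with many blocks makes loss very likely. Decommissioning avoids this by letting HDFS copy the blocks elsewhere before the nodes go away.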

On the other hand, if a TaskTracker is shut down, the JobTracker reschedules its tasks on other TaskTrackers. However, decommissioning a TaskTracker is still required, especially when you want that TaskTracker to stop accepting new tasks, or when tasks take a long time to execute but you still want to be agile in your cluster management.

