Scaling Namespaces and Optimizing Data Storage
Enable disk IO statistics

Disk IO statistics are disabled by default. To enable them, set the file IO sampling percentage to a non-zero value in the hdfs-site.xml file.

  1. Set the dfs.datanode.fileio.profiling.sampling.percentage property to a non-zero value in hdfs-site.xml.
    Sampling disk IO might have a minor impact on cluster performance.
  2. Access the disk IO statistics from the NameNode JMX page at http://<namenode_host>:50070/jmx.
    In the following JMX output example, times are reported in milliseconds. The low average IO latencies indicate that the disk is healthy:
        "name" : "Hadoop:service=DataNode,name=DataNodeVolume-/data/disk2/dfs/data/",
        "modelerType" : "DataNodeVolume-/data/disk2/dfs/data/",
        "tag.Context" : "dfs",
        "tag.Hostname" : "",
        "TotalMetadataOperations" : 67,
        "MetadataOperationRateAvgTime" : 0.08955223880597014,
        "WriteIoRateNumOps" : 7321,
        "WriteIoRateAvgTime" : 0.050812730501297636
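
The check in step 2 can also be scripted. The following is a minimal sketch, not a supported tool. It assumes the JMX page returns a JSON document with a top-level "beans" list and that per-volume beans use the "Hadoop:service=DataNode,name=DataNodeVolume-" name prefix shown above. The helper names (volume_latencies, fetch_jmx) and the 10 ms threshold are hypothetical and chosen only for illustration; pick a threshold that suits your hardware.

```python
import json
from urllib.request import urlopen

# Bean name prefix taken from the JMX output example above.
VOLUME_PREFIX = "Hadoop:service=DataNode,name=DataNodeVolume-"


def volume_latencies(jmx, threshold_ms=10.0):
    """Summarize average IO latencies per volume.

    jmx is the parsed JMX JSON (assumed shape: {"beans": [...]}).
    threshold_ms is an illustrative cutoff, not a documented default.
    """
    report = {}
    for bean in jmx.get("beans", []):
        name = bean.get("name", "")
        if not name.startswith(VOLUME_PREFIX):
            continue  # skip beans that are not per-volume statistics
        volume = name[len(VOLUME_PREFIX):]
        # Keep only the numeric "...AvgTime" metrics (milliseconds).
        avg = {
            key: value
            for key, value in bean.items()
            if key.endswith("AvgTime") and isinstance(value, (int, float))
        }
        slow = sorted(key for key, value in avg.items() if value > threshold_ms)
        report[volume] = {"avg_ms": avg, "slow": slow}
    return report


def fetch_jmx(url, timeout=10):
    """Download and parse a JMX page; the caller supplies the full URL."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Sample payload built from the values in the output example above.
SAMPLE = {
    "beans": [
        {
            "name": VOLUME_PREFIX + "/data/disk2/dfs/data/",
            "TotalMetadataOperations": 67,
            "MetadataOperationRateAvgTime": 0.08955223880597014,
            "WriteIoRateNumOps": 7321,
            "WriteIoRateAvgTime": 0.050812730501297636,
        }
    ]
}

if __name__ == "__main__":
    for volume, stats in volume_latencies(SAMPLE).items():
        status = "check disk" if stats["slow"] else "healthy"
        print(volume, status, stats["avg_ms"])
```

To run the same check against a live cluster, pass the result of fetch_jmx() with the JMX URL from step 2 to volume_latencies() in place of SAMPLE. Filtering on the client side keeps the sketch independent of any query options the JMX page might support.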