The HDFS dashboard helps operators better understand how HDFS is being used and which users and jobs are consuming the most resources within the file system.
This dashboard includes the following paragraphs:
- File Size Distribution
- Users with Maximum Small Files
- Users with Maximum HDFS Utilization
- HDFS File Size Trend
- HDFS Utilization Trend
Most of these paragraphs have titles that are self-explanatory. A few of them are described below to provide more context:
File Size Distribution
For any large multi-tenant cluster, it’s important to identify the proliferation of small files and keep it in check. The paragraph displays a pie chart showing the relative distribution of files by size, categorized as Tiny (0-10K), Mini (10K-1M), Small (1M-30M), Medium (30M-128M), and Large (128M+).
The goal is to show how dominant each file size category is within HDFS. If small files make up a large share, the next paragraph, Users with Maximum Small Files, identifies who is contributing them.
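The bucketing behind such a chart can be sketched in a few lines of Python. This is only an illustration, not the dashboard's actual implementation: the function names are hypothetical, K and M are assumed to be 1024-based, each range is assumed to exclude its upper bound, and sizes between 1M and 30M are labeled Small.

```python
from collections import Counter

KIB = 1024          # assumption: "K" means 1024 bytes
MIB = 1024 * KIB    # assumption: "M" means 1024 * 1024 bytes

# (label, exclusive upper bound in bytes). The exclusive bounds and the
# "Small" label for 1M-30M are assumptions made for this sketch.
BUCKETS = [
    ("Tiny", 10 * KIB),
    ("Mini", 1 * MIB),
    ("Small", 30 * MIB),
    ("Medium", 128 * MIB),
]


def categorize(size_bytes):
    """Return the size category label for a single file size in bytes."""
    for label, upper in BUCKETS:
        if size_bytes < upper:
            return label
    return "Large"


def size_distribution(sizes):
    """Return the fraction of files per category (the pie chart slices)."""
    counts = Counter(categorize(s) for s in sizes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}
```

Calling `size_distribution` on a list of file sizes in bytes yields one fraction per category that actually occurs, which maps directly onto the slices of a pie chart.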
Users with Maximum Small Files
Understanding how prevalent files of specific sizes are is helpful, but the next step is understanding who is responsible for creating those files. The goal of this paragraph is to show who is responsible for creating the majority of small files within HDFS.
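As a rough illustration of this kind of ranking, the sketch below counts small files per owner from a list of (owner, size) pairs. The function name, the input shape, and the 1 MiB cutoff for "small" are all assumptions made for the example; the dashboard's own threshold and data source are not specified here.

```python
from collections import Counter

# Hypothetical cutoff: files under 1 MiB count as "small" in this sketch.
SMALL_FILE_LIMIT = 1024 * 1024


def top_small_file_owners(files, limit=SMALL_FILE_LIMIT, top_n=10):
    """Rank owners by how many files below `limit` bytes they own.

    `files` is an iterable of (owner, size_in_bytes) pairs.
    Returns a list of (owner, small_file_count), largest count first.
    """
    counts = Counter(owner for owner, size in files if size < limit)
    return counts.most_common(top_n)
```

For example, passing `[("alice", 100), ("bob", 5 * 1024 * 1024), ("alice", 2000)]` ranks `alice` first with two small files and leaves `bob` out, because his only file exceeds the cutoff.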