Workload dashboard (MR, Pig, Tez, Hive)
The Workload dashboard (MR, Pig, Tez, Hive) provides key information about workloads that use MapReduce or Tez for execution.
This dashboard includes the following paragraphs:
Longest Running Jobs
Most Resource Intensive Jobs
Most Resource Wasting Jobs
Workloads With Highest HDFS Operations
Workloads Creating Max HDFS Files
Workloads With Largest HDFS Writes
Workloads With Highest CPU Consumption
Workloads With Most Inefficient Data Read
Workloads With Most Input Data Explosion
Job Distribution By Type
Job Submission Trend By Day.Hour
Most of these paragraph titles are self-explanatory. A few are described below to provide more context:
Most Resource Wasting Jobs
Resource waste is calculated as the difference between the memory a job requests and the memory it actually uses.
For example, if a job requests 100 containers of 8 GB each but uses only 5 GB per container, 3 GB per container is considered wasted. Waste is calculated per job, and the top 10 jobs are listed.
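The per-job calculation described above can be sketched as follows. This is an illustrative example only; the field names (`requested_gb`, `used_gb`, `containers`) and the job names are assumptions, not the dashboard's actual schema.

```python
def wasted_memory_gb(requested_gb: float, used_gb: float, containers: int) -> float:
    """Memory requested minus memory actually used, summed over all containers."""
    return max(requested_gb - used_gb, 0.0) * containers

# Hypothetical jobs, mirroring the example above: 100 containers of 8 GB,
# with only 5 GB used per container, wastes 3 GB x 100 = 300 GB.
jobs = {
    "etl_daily": wasted_memory_gb(8, 5, 100),
    "report_job": wasted_memory_gb(4, 4, 50),  # fully used, no waste
}

# The dashboard ranks jobs by total waste and lists the top 10.
top = sorted(jobs.items(), key=lambda kv: kv[1], reverse=True)[:10]
```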
Job Submission Trend By Day.Hour
This paragraph shows the number of jobs submitted by day and hour, using the notation <day>.<hour>. For example:
• Monday.1 - 1am on Monday
• Monday.20 - 8pm on Monday
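A minimal sketch of how the <day>.<hour> labels can be interpreted, assuming a 24-hour clock as in the examples above (the function names are illustrative, not part of the product):

```python
def parse_day_hour(label: str):
    """Split a '<day>.<hour>' label such as 'Monday.20' into its parts."""
    day, hour = label.split(".")
    return day, int(hour)  # hour is on a 24-hour clock (0-23)

def to_ampm(hour: int) -> str:
    """Render a 24-hour value in the am/pm form used in the examples above."""
    if hour == 0:
        return "12am"
    if hour < 12:
        return f"{hour}am"
    if hour == 12:
        return "12pm"
    return f"{hour - 12}pm"
```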
The goal of this dashboard is to identify job submission hotspots during the week and the day. You can use this information to identify the best time to schedule resource-intensive jobs.