Chapter 9. HDP Logfiles

The following sections describe HDP logging and backups:

 1. HDP Log Locations

Under normal operation, each component of the HDP ecosystem writes logs for each of its daemons, as well as configuration logs, statistics, standard error, standard output, and internal diagnostic information.

The following table for HDP 1.3 lists the types of logs and where you can find them. The locations depend on the user that owns the process. For example, $hdfs represents the system user that owns the HDFS component daemons.

 

Table 9.1. HDP log locations (defaults)

Component             Log Type                          Log Location (default)
--------------------  --------------------------------  ------------------------------------------------------
HDFS                  NameNode operational logs         On the NameNode - /var/log/hadoop/$hdfs
HDFS                  DataNode operational logs         On each DataNode - /var/log/hadoop/$hdfs
MapReduce (HDP 1.x)   JobTracker operational logs       On the JobTracker - /var/log/hadoop/$mapred
MapReduce (HDP 1.x)   TaskTracker operational logs      On each TaskTracker - /var/log/hadoop/$mapred
MapReduce (HDP 1.x)   MapReduce job logs                On each TaskTracker - /var/log/hadoop/$mapred/userlogs
Hive                  HiveServer2 operational logs      On the HiveServer2 host - /var/log/hive/hive-server2.*
Hive                  Hive Metastore operational logs   On the Hive Metastore host - /var/log/hive/hive.*
WebHCat               WebHCat Server operational logs   On the WebHCat server - /var/log/webhcat
Oozie                 Oozie Server operational logs     On the Oozie server - /var/log/oozie
HBase                 HBase Master operational logs     On the HBase Master - /var/log/hbase
HBase                 RegionServer operational logs     On each RegionServer - /var/log/hbase
ZooKeeper             ZooKeeper operational logs        On each ZooKeeper node - /var/log/zookeeper
Nagios                Nagios Server operational logs    On the Nagios server - /var/log/nagios (archives in /var/log/nagios/archives)
Ganglia               Ganglia Server logs               On the Ganglia server - /var/log/messages
Ganglia               Ganglia Monitor logs              On each Ganglia Monitor node - /var/log/messages
Ambari                Ambari Server logs                On the Ambari Server - /var/log/ambari-server
Ambari                Ambari Agent logs                 On each Ambari Agent host - /var/log/ambari-agent
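As a quick illustration of the table above, the following shell commands show how you might locate and follow a daemon log on a given host. Actual log file names vary by daemon user and hostname, so the file name patterns below are examples, not guaranteed names:

```
# On the NameNode host: list the HDFS log directory for the hdfs service user
ls /var/log/hadoop/hdfs/

# Follow the NameNode daemon log (file name pattern is illustrative:
# hadoop-<user>-namenode-<hostname>.log)
tail -f /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log

# On an HBase node: show recent ERROR lines from the RegionServer log
grep ERROR /var/log/hbase/hbase-*-regionserver-*.log | tail -20
```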


 2. HDP Log Format

Hadoop and the other HDP components use Log4J to format their log files, and to filter what information is logged. The log format and filter can be configured for each component by editing the corresponding log4j.properties file. These files are found in the configuration folders for each component. The following log4j severity levels are available to filter logs:

  • OFF - Highest possible rank. Turns off all logging.

  • FATAL - Severe errors that will probably cause the application to abort. Displayed on the console.

  • ERROR - Runtime errors or unexpected conditions. The application may continue to run. Displayed on the console.

  • WARN - Undesirable, unexpected, or potentially dangerous conditions. Displayed on the console.

  • INFO - Informational messages on application progress. Displayed on the console.

  • DEBUG - Detailed information on events, used to debug an application. Written to logs only.

  • TRACE - More detailed information than DEBUG, used to debug an application. Written to logs only.

  • ALL - Lowest possible rank. Turns on all logging.

Note

If you do not assign a logging level to a logger, it inherits the first non-null level in the logger hierarchy. By default, the root logger is set to the DEBUG logging level.
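For example, a component's log4j.properties file might set the root logger level and override the level of one noisy class. The appender name and settings below follow common Hadoop conventions, but they are illustrative; the exact contents of your file will differ by component and HDP version:

```properties
# Root logger: INFO and above, routed to a rolling file appender named RFA
log4j.rootLogger=INFO,RFA

log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=10
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

# Quiet a chatty logger: keep only WARN and above from this class
log4j.logger.org.apache.hadoop.hdfs.StateChange=WARN
```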

 3. HDP Backups

You can back up your HDP clusters to an external backup HDP cluster using distcp, a bulk data movement tool. Invoke distcp periodically to transfer HDFS datasets from the active HDP cluster to the backup HDP cluster.

For more information on backing up clusters with distcp, see the distcp documentation for your Hadoop release.
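A periodic backup run might look like the following, where the NameNode addresses are placeholders for your active and backup clusters. The -update flag makes repeated runs incremental by skipping files that already match at the destination:

```
# Full copy of a dataset from the active cluster to the backup cluster
hadoop distcp hdfs://active-nn:8020/apps/warehouse hdfs://backup-nn:8020/apps/warehouse

# Incremental re-sync on subsequent runs
hadoop distcp -update hdfs://active-nn:8020/apps/warehouse hdfs://backup-nn:8020/apps/warehouse
```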
