Cluster diagnostic collection
The HST agents capture, anonymize, and encrypt cluster diagnostic data, and then send it to the central HST server, which coalesces the data into a single downloadable file called a "bundle". The HST agent processes are short-lived: each one is started only for a specific data capture task.
To provide the most complete picture of cluster utilization, HST agents must be installed on every node in the cluster. After an HST agent has captured the requested data from the host it is installed on, the process exits.
The following image illustrates the communication between HST agents and the HST server:
SmartSense anonymizes and encrypts the diagnostic information captured in the bundle. You can extend the anonymization process by adding your own rules.
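As a rough illustration of what rule-based anonymization means in general, the following Python sketch replaces every value that matches a pattern with a stable, non-reversible token. It does not reproduce the SmartSense rule format or the HST agent implementation; the names `RULES`, `mask`, and `anonymize`, the salt value, and the example patterns are all assumptions made for this example.

```python
import hashlib
import re

# Hypothetical rule set (rule name -> pattern). This is NOT the SmartSense
# rule format; it only illustrates the idea of pattern-based masking.
RULES = {
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def mask(value: str, salt: str) -> str:
    """Map a sensitive value to a stable token derived from a salted hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "anon-" + digest[:12]


def anonymize(text: str, rules=RULES, salt: str = "example-salt") -> str:
    """Apply every rule in turn, replacing each match with its token."""
    for pattern in rules.values():
        text = pattern.sub(lambda m: mask(m.group(0), salt), text)
    return text


# Extending the rule set with a custom rule (hypothetical hostname pattern).
custom_rules = dict(RULES, hostname=re.compile(r"\b[\w-]+\.example\.com\b"))
print(anonymize("node1.example.com listens on 10.0.0.1", rules=custom_rules))
```

Because the same input always maps to the same token, repeated occurrences of a value can still be correlated within the masked output even though the original value is hidden.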
There are two types of bundles: one for ad-hoc troubleshooting of support cases, and the other for proactive analysis and recommendations.
Support case troubleshooting bundles
Bundles captured for troubleshooting contain configuration and metrics for each node in the cluster, and logs for only the subset of services and hosts that you choose before initiating the capture process. Additionally, they may contain application logs if the capture is for a YARN application or a Hive query. The purpose of these bundles is to provide support engineers with basic diagnostic information that helps them understand the state of your cluster so that they can troubleshoot and resolve issues quickly.
Proactive analysis bundles
Bundles captured for analysis contain configuration and metrics for each node in the cluster, but do not contain any logs. Their purpose is to produce recommendations for changing your cluster configuration to ensure better security, performance, and operations. These recommendations are available in the SmartSense View in the Ambari web UI and in the SmartSense tab on the Hortonworks support portal.
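The difference between the two bundle types can be summarized in a small Python sketch. The function name `bundle_contents`, the type labels, and the content labels are hypothetical and exist only to restate the descriptions above; they are not part of any SmartSense interface.

```python
def bundle_contents(bundle_type: str, application_capture: bool = False) -> set:
    """Return the kinds of data a bundle of the given type can contain.

    Labels are illustrative only and mirror the descriptions in this topic.
    """
    # Both bundle types carry configuration and metrics for every node.
    contents = {"configuration", "metrics"}
    if bundle_type == "troubleshooting":
        # Logs are limited to the services and hosts chosen before capture.
        contents.add("service logs (selected services and hosts)")
        # Application logs may be present only when the capture targets
        # a YARN application or a Hive query.
        if application_capture:
            contents.add("application logs")
    elif bundle_type != "analysis":
        raise ValueError(f"unknown bundle type: {bundle_type}")
    return contents


print(sorted(bundle_contents("analysis")))
print(sorted(bundle_contents("troubleshooting", application_capture=True)))
```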