Installation

Installing SmartSense with Ambari

Installing SmartSense with Ambari involves the following steps:

Adding the SmartSense Service

Before you start the installation:

  • You should know your SmartSense ID and account name (both are available in the Hortonworks support portal in the Tools tab).

  • You must also ensure that an Ambari agent is running on the same host as the Ambari server.
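The second prerequisite can be checked from a shell on the Ambari server host. The following is a minimal sketch, not an official procedure: the helper names `service_running` and `check_ambari_host` are hypothetical, and the sketch assumes that `ambari-server status` and `ambari-agent status` exit with status 0 only when the respective service is running, which you should verify for your Ambari version.

```shell
# service_running CMD
# Returns 0 if "CMD status" succeeds, 1 if it fails, 2 if CMD is not installed.
# Assumption: CMD exits with status 0 only when the service is running.
service_running() {
    command -v "$1" >/dev/null 2>&1 || return 2
    "$1" status >/dev/null 2>&1 || return 1
    return 0
}

# check_ambari_host
# Run on the Ambari server host; succeeds only if both services report running.
check_ambari_host() {
    if ! service_running ambari-server; then
        echo "ambari-server is not running or not installed on this host"
        return 1
    fi
    if ! service_running ambari-agent; then
        echo "ambari-agent is not running or not installed on this host"
        return 1
    fi
    echo "Ambari server and Ambari agent are both running on this host"
}
```

Calling `check_ambari_host` on the Ambari server host prints which of the two services, if any, is missing.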

To begin the installation, follow these steps:

  1. If you are using Ambari version 2.4.x and want to use SmartSense 1.4.x, you must first download and install it (see Downloading and Installing SmartSense Binary).

  2. From the Ambari web UI, select Add Service from the Actions drop-down menu.

  3. From the list of installable services, select SmartSense, and then click Next.

  4. On the Assign Masters page, select cluster nodes for the HST server, Activity Analyzer, and Activity Explorer, and then click Next.

    • For a list of criteria to determine the best node to select for HST server, see the HST Server Placement section.

    • For a list of criteria to determine the best nodes to select for Activity Analyzers, see the Activity Analyzer Placement section.

  5. On the Customize Services page, validate the values in the following fields, as appropriate to your environment:

    Ambari 2.4+

    Configuration Tab: Basic
    Property: Customer account name
    Note: Your account name, available from the Tools tab of the Hortonworks support portal.

    Configuration Tab: Basic
    Property: SmartSense ID
    Note: Your SmartSense ID, available from the Tools tab of the Hortonworks support portal.

    Configuration Tab: Basic
    Property: Notification Email
    Note: The email address that is notified when SmartSense bundles have been received and recommendations are ready for your review.

    Configuration Tab: Basic
    Property: Enable Flex Subscription
    Note: Use this option only if you have an existing Hortonworks Flex Support Subscription. You must enter your Flex Subscription ID.

    Configuration Tab: Basic
    Property: Bundle Storage Directory
    Note: The directory on the HST server that is used to store completed bundles. Because bundles can be large, this directory should have at least 1 GB of free space.

    Configuration Tab: Basic
    Property: Server Temporary Data Directory
    Note: The directory on the HST server that is used to assemble results from HST agents into completed bundles. This directory must be large enough to hold the intermediate results of HST agent data collection: at least 5 GB of free space.

    Configuration Tab: Activity Analysis
    Property: Password for user 'admin'
    Note: The password for the Activity Explorer admin user.

    Click Next.

    The Ambari Stack Advisor assesses your cluster configuration and might alert you to configuration issues. This behavior is not specific to SmartSense; Ambari does this whenever any service is added. SmartSense never makes configuration changes to your cluster. No cluster services need to be restarted after installing SmartSense, and any configuration changes that appear at this step should be reverted.

    If you have a kerberized cluster, you will be prompted for the KDC admin credentials during this step.

  6. On the Review page, click Deploy to complete your SmartSense service installation.

    Note

    When Activity Analyzer is installed, Ambari may prompt you to restart the HDFS, YARN, and AMS services so that Activity Analyzer can communicate with them.
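The free-space guidance in step 5 (at least 1 GB for the Bundle Storage Directory and at least 5 GB for the Server Temporary Data Directory) can be checked on the HST server host before you deploy. This is a minimal sketch: `has_free_gb` is a hypothetical helper, not part of SmartSense, and the directory paths are left for you to supply from your own configuration.

```shell
# has_free_gb DIR MIN_GB
# Returns 0 if the filesystem holding DIR has at least MIN_GB gigabytes available,
# 1 if it has less, and 2 if the available space could not be determined.
has_free_gb() {
    dir=$1
    min_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # on the second line, the fourth column is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] || return 2
    if [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]; then
        return 0
    fi
    return 1
}

# Example usage (replace the placeholders with the directories you configured):
#   has_free_gb "<bundle storage directory>" 1 || echo "bundle directory needs 1 GB free"
#   has_free_gb "<server temporary data directory>" 5 || echo "temp directory needs 5 GB free"
```

Because `df` reports on the filesystem that contains the directory, two directories on the same filesystem share the same free space; in that case the two requirements add up.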

Downloading and Installing SmartSense Binary

If you want to use SmartSense 1.4.x with Ambari version 2.4.x, you must first download it from the Tools tab of the Hortonworks support portal (https://support.hortonworks.com).

To install SmartSense, follow these steps:

  1. Install the SmartSense package on the Ambari server host:

    • RHEL, CentOS, or SLES:

      # rpm -ivh smartsense-hst-$HST_VERSION.x86_64.rpm

    • Debian or Ubuntu:

      # dpkg -i smartsense-hst_$HST_VERSION.deb

  2. Add the SmartSense service to Ambari by running hst add-to-ambari:

    # hst add-to-ambari
    Enter SmartSense distributable path: /root/smartsense-hst-$HST_VERSION.x86_64.rpm
    Added SmartSense service definition to Ambari
    
    NOTE: It is required to restart Ambari Server for changes to reflect. Please restart ambari using 'ambari-server restart'

  3. Restart the Ambari server by running ambari-server restart.

After you complete this task, read HST Server Placement and Activity Analyzer Placement, and then follow the steps in Installing SmartSense with Ambari.
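The three steps of this task can be summarized as a command plan that depends on the package type. The sketch below only prints the commands rather than running them, because hst add-to-ambari prompts interactively for the distributable path. The helper names `install_cmd` and `print_install_plan` are hypothetical and exist only for illustration.

```shell
# install_cmd PACKAGE_FILE
# Prints the package install command that matches the file extension.
install_cmd() {
    case "$1" in
        *.rpm) echo "rpm -ivh $1" ;;   # RHEL, CentOS, or SLES
        *.deb) echo "dpkg -i $1" ;;    # Debian or Ubuntu
        *)     echo "unsupported package type: $1" >&2; return 1 ;;
    esac
}

# print_install_plan PACKAGE_FILE
# Prints the three commands of this task in order, one per line.
print_install_plan() {
    cmd=$(install_cmd "$1") || return 1
    printf '%s\n' "$cmd" "hst add-to-ambari" "ambari-server restart"
}
```

For example, `print_install_plan smartsense-hst-$HST_VERSION.x86_64.rpm` lists the rpm command followed by the hst and ambari-server commands, all of which are run as root on the Ambari server host.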

HST Server Placement

You should designate one node in the HDP cluster as the HST server, so that this component can efficiently consolidate the data collected by all HST agents into a single downloadable file (referred to as a bundle). Any management node, such as the host running Ambari Server or Metrics Server, is a good choice for HST server placement.

Administrators and each HST agent in the cluster must have network access to the HST server. This connectivity is required for agents to consolidate their data and for Hadoop administrators to download completed bundles. For a full list of ports and a data flow diagram, refer to SmartSense Ports & Traffic Flow.

Activity Analyzer Placement

The Activity Analyzer component can extract, aggregate, and store utilization data for all three supported analyzers: HDFS, YARN, and MapReduce & Tez. Before installing SmartSense, you should understand how and where to deploy these analyzers. You must install multiple Activity Analyzer instances; the exact number depends on which analyzers you plan to use and whether HDFS is configured for NameNode HA.

Note

Activity Analyzers need HDFS, YARN, MR, and Tez clients installed on the same host as the analyzer.

HDFS Analyzer

For HDFS analysis, an Activity Analyzer needs to be deployed to each NameNode in the cluster. These instances automatically begin processing the fsimage on startup and reprocess the latest fsimage data once every 24 hours. By default, when deployed on a NameNode, these Activity Analyzers do not process YARN or MapReduce & Tez utilization data; this reduces the amount of processing done on servers hosting critical services like the NameNode.

Resource requirements: The HDFS Analyzer typically runs for a very short period of time, and its resource consumption depends on the fsimage size. For example, analyzing a 200-million-object fsimage is anticipated to take less than 15 minutes. The HDFS Analyzer is mostly a single-threaded process and consumes up to one core during this execution time.

YARN, MapReduce & Tez Analyzer

Activity Analyzers deployed to the NameNodes in the cluster process only HDFS utilization data. Therefore, to process YARN, MapReduce, and Tez utilization data, another instance of the Activity Analyzer needs to be deployed to a different node in the cluster, preferably a non-master node. On startup, this Activity Analyzer checks that it is not deployed to a NameNode, and then begins to process YARN, MapReduce, and Tez utilization data. It starts and schedules analysis separately for YARN applications and for MapReduce and Tez jobs. Both analyses constantly poll for completed applications or jobs. Each completed application or job is analyzed, and its utilization data is stored in the Ambari Metrics System.
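The placement rules in this section imply a simple count of Activity Analyzer instances: one per NameNode if you use the HDFS analyzer, plus one on a non-NameNode host if you use the YARN, MapReduce & Tez analyzer. The following sketch expresses that arithmetic; `analyzer_count` is a hypothetical helper, and the formula is a reading of the rules above rather than an official sizing tool.

```shell
# analyzer_count NAMENODES USE_HDFS USE_YARN
# NAMENODES: number of NameNodes (1 without HA, 2 with NameNode HA)
# USE_HDFS, USE_YARN: "yes" or "no"
# Prints the number of Activity Analyzer instances to deploy.
analyzer_count() {
    namenodes=$1
    use_hdfs=$2
    use_yarn=$3
    count=0
    # One HDFS analyzer instance on each NameNode.
    if [ "$use_hdfs" = yes ]; then
        count=$((count + namenodes))
    fi
    # One additional instance on a non-NameNode host for YARN, MapReduce & Tez.
    if [ "$use_yarn" = yes ]; then
        count=$((count + 1))
    fi
    echo "$count"
}
```

For a cluster with NameNode HA that uses all analyzers, `analyzer_count 2 yes yes` prints 3: two instances on the NameNodes and one on another node.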