Chapter 8. Using the Falcon View

Apache Falcon solves enterprise challenges related to Hadoop data replication, business continuity, and lineage tracing by deploying a framework for data management and processing. The Falcon framework can also leverage other HDP components, such as Apache Pig, Apache Hadoop Distributed File System (HDFS), Apache Sqoop, Apache Hive, Apache Spark, and Apache Oozie. Falcon enables this simplified management by providing a framework to define and manage backup, replication, and data transfer.

Hadoop administrators can use the Falcon View to centrally define, schedule, and monitor data management policies. Falcon uses those definitions to auto-generate workflows in Apache Oozie.

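This chapter describes the following:

  • Configuring Your Cluster

  • Installing and Configuring the Falcon View

  • Accessing the Falcon Documentation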

1. Configuring Your Cluster

For the Falcon View to access HDFS, the Ambari Server daemon hosting the view must act as the proxy user for HDFS. This allows Ambari to submit requests to HDFS on behalf of the users of the Falcon View. This is critical because the Falcon View stores metadata about each user's Falcon entity definitions in HDFS. It also means that every user who accesses the Falcon View must have a user directory set up in HDFS.

1.1. Set Up HDFS Proxy User

To set up an HDFS proxy user for the Ambari Server daemon account, configure the proxy user in the HDFS configuration. The property names depend on the account that the ambari-server daemon runs as. For example, if ambari-server runs as root, set up an HDFS proxy user for root as follows:

  1. In Ambari Web, browse to Services > HDFS > Configs.

  2. Under the Advanced tab, navigate to the Custom core-site section.

  3. Click Add Property… to add the following custom properties:

    hadoop.proxyuser.root.groups="users"
    hadoop.proxyuser.root.hosts=ambari-server.hostname

    Notice the ambari-server daemon account name root is part of the property name. Be sure to modify this property name for the account name you are running the ambari-server as. For example, if you were running ambari-server daemon under an account name of ambariusr, you would use the following properties instead:

    hadoop.proxyuser.ambariusr.groups="users"
    hadoop.proxyuser.ambariusr.hosts=ambari-server.hostname

    Similarly, if you have configured Ambari Server for Kerberos, be sure to modify this property name for the primary Kerberos principal user. For example, if ambari-server is set up for Kerberos using the principal ambari-server@EXAMPLE.COM, you would use the following properties instead:

    hadoop.proxyuser.ambari-server.groups="users"
    hadoop.proxyuser.ambari-server.hosts=ambari-server.hostname
  4. Save the configuration change and restart the required components as indicated by Ambari.
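
The naming rule in step 3 can be sketched as a small shell helper. This is only an illustration: proxyuser_props is a hypothetical function, not part of Ambari or Hadoop, and it assumes that the account name used in the property is the primary component of the Kerberos principal (the part before any / or @).

```shell
#!/bin/sh
# Print the two Custom core-site proxy-user properties for the account
# (or Kerberos principal) that runs the ambari-server daemon.
# proxyuser_props is a hypothetical helper, not an Ambari or Hadoop command.
proxyuser_props() {
    account=${1%%@*}        # drop a Kerberos realm such as @EXAMPLE.COM
    account=${account%%/*}  # drop a Kerberos instance part, if present
    host=$2                 # host name of the Ambari server
    printf 'hadoop.proxyuser.%s.groups="users"\n' "$account"
    printf 'hadoop.proxyuser.%s.hosts=%s\n' "$account" "$host"
}

# Example: the Kerberos case described in step 3.
proxyuser_props ambari-server@EXAMPLE.COM ambari-server.hostname
```

The helper only prints the property lines; they still have to be entered through Add Property in Ambari Web.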

1.2. Set Up HDFS User Directory

The Falcon View stores user metadata in HDFS. By default, the location in HDFS for this metadata is /user/${username}, where ${username} is the username of the user currently logged in to the Falcon View.

Important

Many users start with the default Ambari admin user, so the /user/admin directory must exist in HDFS. Be sure to create the admin user directory in HDFS using these instructions before using the view.

To create user directories in HDFS, do the following for each user you plan to have use the Falcon View.

  1. Connect to a host in the cluster that includes the HDFS client.

  2. Switch to the hdfs system account user.

    su - hdfs
  3. Using the HDFS client, make an HDFS directory for the user. For example, if your username is admin, you would create the following directory.

    hadoop fs -mkdir /user/admin
  4. Set the ownership on the newly created directory. For example, if your username is admin, you would make that user the directory owner.

    hadoop fs -chown admin:hadoop /user/admin
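
When several users need directories, steps 3 and 4 can be repeated in a loop. The sketch below is a dry run: hdfs_user_dir_cmds is a hypothetical helper that only prints the commands from the steps above for each username, and alice is a made-up example user. It assumes the same hadoop group ownership as the example in step 4.

```shell
#!/bin/sh
# Print (dry run) the HDFS commands from steps 3 and 4 for each username.
# hdfs_user_dir_cmds is a hypothetical helper; it does not contact HDFS.
hdfs_user_dir_cmds() {
    for user in "$@"; do
        printf 'hadoop fs -mkdir /user/%s\n' "$user"
        printf 'hadoop fs -chown %s:hadoop /user/%s\n' "$user" "$user"
    done
}

# Example: the default admin user plus a hypothetical user named alice.
hdfs_user_dir_cmds admin alice
```

After reviewing the printed commands, they can be run on a host with the HDFS client while logged in as the hdfs user.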

2. Installing and Configuring the Falcon View

You must manually copy the .jar file for the Falcon View, then configure Ambari to access the View. You can install the Falcon View in a secure or an unsecure cluster. If using a secure cluster, Ambari and Falcon must be properly configured with Kerberos.

Prerequisites

Steps

  1. Copy the Falcon View falcon-ambari-view.jar file from the Falcon server /webapp directory to the Ambari server /views directory.

    • If the Falcon and Ambari servers are on the same host, use the copy command:

      cp /usr/hdp/current/falcon-server/server/webapp/falcon-ambari-view.jar /var/lib/ambari-server/resources/views/
    • If the Falcon server is on a remote host, use the secure copy command for your operating system.

      A key pair might be required. See your operating system documentation for more information about remote copies.

  2. Restart the Ambari server.

    [root@DataMovementDocs-1 ~]# ambari-server restart
  3. In Ambari, navigate to user_name > Manage Ambari.

  4. Under Deploy Views, click Views, then click Falcon > Create Instance in the Views list.

  5. Provide the required Details information.

    Instance Name: 250 characters, no spaces, no special characters
    Display Name: 250 characters, including spaces; no special characters; can be the same as the Instance Name
    Description: 140 characters max, including spaces; special characters allowed
    Note

    If you enter more than the allowed number of characters, you might see the error message Cannot create instance: Server Error.

  6. Select a cluster configuration.

    The Local and Remote fields populate with the names of available clusters. The authentication type for the cluster is automatically recognized.

    To use a custom cluster location, enter the Falcon service URI and authentication type of simple or kerberos.

  7. Click Save.

    The Permissions section displays at the bottom of the Views page.

  8. (Optional) Set the permissions for access to the view.

  9. Hover over the Views icon to verify that your Falcon View is available in the menu.

    Note

    Do not click on the Falcon link yet. You must make additional configuration changes before you can access the Falcon View.

  10. Click the Ambari icon to return to the Dashboard window, then click the Falcon service and the Configs tab.

  11. Scroll to the Falcon startup.properties section, locate the *.application.services field, and enter the following services immediately above the line org.apache.falcon.metadata.MetadataMappingService:

    org.apache.falcon.service.GroupsService,\
    org.apache.falcon.service.ProxyUserService,\

  12. Add the proxy user for hosts and groups in the Custom falcon-runtime.properties section.

    The proxy user is the user that the Falcon process runs as, typically falcon.

    1. Click Add Property.

    2. Add the following key/value pairs.

      Substitute #USER# with the proxy user configured for the Ambari server.

      • Key=*.falcon.service.ProxyUserService.proxyuser.#USER#.hosts, Value=*

        These are the hosts from which #USER# can impersonate other users.

      • Key=*.falcon.service.ProxyUserService.proxyuser.#USER#.groups, Value=*

        These are the groups that the users being impersonated must belong to.

    Example 8.1. Substitute #USER#

    In the key/value pairs above, if the #USER# is “falcon”, enter *.falcon.service.ProxyUserService.proxyuser.falcon.hosts.

    The wildcard value * (asterisk) allows impersonation from any host or of any user. If you do not use the wildcard character, enter the appropriate host or group values.

  13. Click Save on the information bar at the top of the Configs page.

    If you try to leave the page without clicking Save, you see a Warning message. Click Save in the Warning dialog box.

    A Restart Required message displays at the top of the Falcon Configs page.

  14. Click Restart > Restart All Affected to restart the Falcon services.

  15. When the restart completes, verify that you can access the Falcon View by clicking Falcon in the Views menu.
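
The key/value pairs from step 12 can be generated with a short shell helper, which makes the #USER# substitution explicit. This is a sketch only: falcon_proxyuser_props is a hypothetical function, not part of Falcon or Ambari, and it assumes the wildcard * as the default for both hosts and groups, as in the step above.

```shell
#!/bin/sh
# Print the Custom falcon-runtime.properties proxy-user keys for one user.
# falcon_proxyuser_props is a hypothetical helper, not a Falcon command.
# Usage: falcon_proxyuser_props USER [HOSTS] [GROUPS]
falcon_proxyuser_props() {
    user=$1
    hosts=${2:-*}        # hosts from which USER can impersonate other users
    group_list=${3:-*}   # groups the impersonated users must belong to
    printf '*.falcon.service.ProxyUserService.proxyuser.%s.hosts=%s\n' "$user" "$hosts"
    printf '*.falcon.service.ProxyUserService.proxyuser.%s.groups=%s\n' "$user" "$group_list"
}

# Example: #USER# is falcon, with the wildcard for hosts and groups.
falcon_proxyuser_props falcon
```

Passing explicit second and third arguments restricts impersonation to those hosts or groups instead of using the wildcard.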

3. Accessing the Falcon Documentation

You can access the Falcon documentation in the Data Movement and Integration guide on the Hortonworks documentation website.