Command Line Installation

Configure Hadoop

These configuration variables are in the [hadoop] section of the /etc/hue/conf/hue.ini configuration file:

  1. Configure an HDFS cluster.

    Hue supports only one HDFS cluster. Ensure that you define the HDFS cluster under the [hadoop] [[hdfs_clusters]] [[[default]]] subsection of the /etc/hue/conf/hue.ini configuration file.

    Use the following variables to configure the HDFS cluster:

    Variable: fs_defaultfs
    Description: This is equivalent to fs.defaultFS (fs.default.name) in the Hadoop configuration.
    Default/Example Value: hdfs://fqdn.namenode.host:8020

    Variable: webhdfs_url
    Description: WebHDFS URL.
    Default/Example Value: The default value is the HTTP port on the NameNode. Example: http://fqdn.namenode.host:50070/webhdfs/v1
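
    As a minimal sketch, the two HDFS variables above might appear in hue.ini as follows. The host names are the placeholder values from the examples above and must be replaced with your NameNode's fully qualified domain name; the indentation is only for readability.

    ```ini
    [hadoop]
      [[hdfs_clusters]]
        [[[default]]]
          # Equivalent to fs.defaultFS in the Hadoop configuration.
          fs_defaultfs=hdfs://fqdn.namenode.host:8020
          # WebHDFS endpoint on the NameNode HTTP port.
          webhdfs_url=http://fqdn.namenode.host:50070/webhdfs/v1
    ```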

  2. Configure a YARN (MR2) Cluster.

    Hue supports only one YARN cluster.

    Ensure that you define the YARN cluster under the [hadoop] [[yarn_clusters]] [[[default]]] subsection of the /etc/hue/conf/hue.ini configuration file.

    For more information about configuring Hue with a ResourceManager HA cluster, see Deploy Hue with a ResourceManager HA Cluster in the High Availability for Hadoop Guide.

    Use the following variables to configure a YARN cluster:

    Variable: submit_to
    Description: Set this property to true. Hue submits jobs to this YARN cluster. Note that JobBrowser is not able to show MR2 jobs.
    Default/Example Value: true

    Variable: resourcemanager_api_url
    Description: The URL of the ResourceManager API.
    Default/Example Value: http://fqdn.resourcemanager.host:8088

    Variable: proxy_api_url
    Description: The URL of the ProxyServer API.
    Default/Example Value: http://fqdn.resourcemanager.host:8088

    Variable: history_server_api_url
    Description: The URL of the HistoryServer API.
    Default/Example Value: http://fqdn.historyserver.host:19888

    Variable: node_manager_api_url
    Description: The URL of the NodeManager API.
    Default/Example Value: http://fqdn.resourcemanager.host:8042
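
    Put together, a YARN cluster definition built from the example values above could look like the following sketch. All host names are placeholders taken from the examples, not values that will work unchanged in your environment.

    ```ini
    [hadoop]
      [[yarn_clusters]]
        [[[default]]]
          # Hue submits jobs to this YARN cluster.
          submit_to=true
          resourcemanager_api_url=http://fqdn.resourcemanager.host:8088
          proxy_api_url=http://fqdn.resourcemanager.host:8088
          history_server_api_url=http://fqdn.historyserver.host:19888
          node_manager_api_url=http://fqdn.resourcemanager.host:8042
    ```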

  3. Configure Beeswax

    In the [beeswax] section of the /etc/hue/conf/hue.ini configuration file, you can specify the following values:

    Variable: hive_server_host
    Description: Host where the Hive server Thrift daemon is running. If Kerberos security is enabled, use the fully qualified domain name (FQDN).
    Default/Example Value: (none given)

    Variable: hive_server_port
    Description: Port on which the HiveServer2 Thrift server runs.
    Default/Example Value: 10000

    Variable: hive_conf_dir
    Description: Hive configuration directory where hive-site.xml is located.
    Default/Example Value: /etc/hive/conf

    Variable: server_conn_timeout
    Description: Timeout in seconds for Thrift calls to HiveServer2.
    Default/Example Value: 120
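
    The Beeswax values above could be combined as in this sketch. The host name fqdn.hiveserver.host is a hypothetical placeholder (the guide gives no example value for hive_server_host); the remaining values are the defaults listed above.

    ```ini
    [beeswax]
      # Hypothetical placeholder; use the FQDN of your Hive server host.
      hive_server_host=fqdn.hiveserver.host
      hive_server_port=10000
      hive_conf_dir=/etc/hive/conf
      # Timeout in seconds for Thrift calls to HiveServer2.
      server_conn_timeout=120
    ```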

    Important

    Depending on your environment and the Hive queries you run, queries might fail with an "internal error processing query" message.

    If this happens, look for the error message "java.lang.OutOfMemoryError: GC overhead limit exceeded" in the beeswax_server.out log file. To avoid this out-of-memory error, increase the heap size by modifying the hadoop-env.sh file and changing the value of HADOOP_CLIENT_OPTS.

  4. Configure HiveServer2 over SSL (Optional)

    Make the following changes to the /etc/hue/conf/hue.ini configuration file to configure Hue to communicate with HiveServer2 over SSL:

    [[ssl]]
    # SSL communication enabled for this server.
    enabled=false
    # Path to Certificate Authority certificates.
    cacerts=/etc/hue/cacerts.pem
    # Path to the public certificate file.
    cert=/etc/hue/cert.pem
    # Choose whether Hue should validate certificates received from the server.
    validate=true

  5. Configure JobDesigner and Oozie

    In the [liboozie] section of the /etc/hue/conf/hue.ini configuration file, set oozie_url to the URL of the Oozie service, as specified by the OOZIE_URL environment variable for Oozie.
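
    For illustration, the entry might look like the sketch below. The host name fqdn.oozie.host is a hypothetical placeholder, and port 11000 with the /oozie path is assumed here because it is the usual Oozie default; use whatever your OOZIE_URL environment variable actually contains.

    ```ini
    [liboozie]
      # Assumed example value; copy the real URL from OOZIE_URL.
      oozie_url=http://fqdn.oozie.host:11000/oozie
    ```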

  6. Configure WebHCat

    In the [hcatalog] section of the /etc/hue/conf/hue.ini configuration file, set templeton_url to the URL of the WebHCat server, using its hostname or IP address. For example: http://hostname:50111/templeton/v1/
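
    A minimal sketch of the resulting entry, where hostname is a placeholder for the host name or IP address of your WebHCat server:

    ```ini
    [hcatalog]
      # Replace hostname with your WebHCat server's host name or IP address.
      templeton_url=http://hostname:50111/templeton/v1/
    ```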