5. Known Issues

In this section:

  • Ambari does not support running or installing stacks on Ubuntu.

  • The component version information displayed by Ambari is based on the Ambari Stack definition. If you have applied patches to the Stack and to your software repository, the component version displayed by Ambari might differ from the version actually installed. A version mismatch has no functional impact on Ambari. If you have any questions about component versions, check the rpm version installed on the host itself.

  • BUG-24234: Unable to start/stop services when using Oracle database for Ambari.

    Problem: If you are using Oracle for the Ambari DB, then when performing a Start All/Stop All, Ambari can become unresponsive and the following ORA error is written to the ambari-server log:

    08:54:51,320 ERROR [qtp1280560314-2070] ReadHandler:84 - Caught a runtime exception executing a query
    Local Exception Stack: 
    Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.4.0.v20120608-r11652): org.eclipse.persistence.exceptions.DatabaseException
    Internal Exception: java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000 

    Workaround: Please upgrade to Ambari 1.6.1 and contact Hortonworks Support for a patch to apply.

  • BUG-21182: After upgrading to Ambari 1.6.1, Agent machines can no longer connect to Server with Two-Way SSL enabled.

    Problem: If you have enabled two-way SSL for Server-Agent communication, then after upgrading to Ambari 1.6.1 the Agents may fail to reconnect to the Server due to a certificate error. This can happen if you change the JDK by re-running ambari-server setup with the custom JDK path option, which invalidates the certificates on the Server; the Server then no longer accepts the Agent certificates.

    Workaround: Either disable two-way SSL, or delete the certificates on the Agent machines and restart the Agents, as sketched below.
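
    A minimal sketch of the second option, assuming the default Agent certificate directory /var/lib/ambari-agent/keys (check the [security] section of /etc/ambari-agent/conf/ambari-agent.ini for the actual location before deleting anything):

    # Run on each Agent machine: remove the old certificates, then restart the Agent
    rm -f /var/lib/ambari-agent/keys/*.crt /var/lib/ambari-agent/keys/*.key
    ambari-agent restart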

  • BUG-20651: AMBARI-6702. RPM repository gets corrupted periodically on machines running Ambari Agents.

    Problem: On hosts that are being managed by Ambari, the Ambari Agents monitor the yum repository for packages. In certain scenarios, the RPM database becomes corrupted and the yum command fails. You will notice an ERROR message in the /var/log/ambari-agent/ambari-agent.log file indicating that the agent PackagesAnalyzer failed. Any rpm or yum command run on this host will also fail, which prevents Ambari from starting components on the host.

    ERROR 2014-04-24 05:33:22,669 PackagesAnalyzer.py:43 - Task timed out and will be killed
    ...
    ERROR: cannot open Packages database in /var/lib/rpm 

    Workaround: You need to repair the RPM database. Run the following on the affected host (a verification check follows the commands):

    rm /var/lib/rpm/__db*
    rpm -vv --rebuilddb
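
    To confirm the rebuild succeeded, a simple package query (an illustrative check, not part of the original workaround) should now complete without errors:

    rpm -qa | head -5
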
  • BUG-20453: Problem starting Oozie when using external PostgreSQL with HDP 2.0 Stack.

    Problem: If you use Ambari 1.6.1 to install the HDP 2.0 Stack and the Oozie Service with an external PostgreSQL database, Oozie will fail to start with the following error:

    Fail: Execution of '/usr/jdk64/jdk1.7.0_45/bin/java -cp /usr/lib/ambari-agent/DBConnectionVerification.jar:/usr/lib/oozie/libserver/postgresql-9.0-801.jdbc4.jar org.apache.ambari.server.DBConnectionVerification jdbc:postgresql://172.18.145.25:5432/ooziedb oozieuser [PROTECTED] org.postgresql.Driver' returned 1. ERROR: Unable to connect to the DB. Please check DB connection properties.
    java.lang.ClassNotFoundException: org.postgresql.Driver

    Workaround: You must download a PostgreSQL JDBC JAR file and make it available to Ambari and Oozie, as follows (a consolidated shell sketch follows these steps):

    1. Download postgresql-9.0-802.jdbc4.jar from http://jdbc.postgresql.org/download.html

    2. Rename postgresql-9.0-802.jdbc4.jar to postgresql-9.0-801.jdbc4.jar

    3. Create the required directory on the Oozie Server host: mkdir -p /usr/lib/oozie/libserver/

    4. Copy postgresql-9.0-801.jdbc4.jar to the newly created directory: cp /path/to/jdbc/postgresql-9.0-801.jdbc4.jar /usr/lib/oozie/libserver/

    5. Copy postgresql-9.0-801.jdbc4.jar to libext: cp /path/to/jdbc/postgresql-9.0-801.jdbc4.jar /usr/lib/oozie/libext/
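
    For convenience, the five steps above can be combined into the following shell sketch, run on the Oozie Server host. The direct download URL is an assumption; confirm it at http://jdbc.postgresql.org/download.html:

    # Steps 1-5 combined: download, rename, and copy the PostgreSQL JDBC JAR
    wget http://jdbc.postgresql.org/download/postgresql-9.0-802.jdbc4.jar
    mv postgresql-9.0-802.jdbc4.jar postgresql-9.0-801.jdbc4.jar
    mkdir -p /usr/lib/oozie/libserver/
    cp postgresql-9.0-801.jdbc4.jar /usr/lib/oozie/libserver/
    cp postgresql-9.0-801.jdbc4.jar /usr/lib/oozie/libext/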

  • BUG-20068: Error adding Storm on Kerberos-enabled cluster after upgrade.

    Problem: After upgrading to Ambari 1.6.0 or 1.6.1 and upgrading from the HDP 2.0 Stack to the HDP 2.1 Stack in a Kerberos-enabled cluster, you may have issues starting Storm, with the following error:

    Fail: Configuration parameter 'storm_principal_name' was not found in configurations dictionary!

    Workaround: After adding Storm, run the following commands on the Ambari Server to fix this issue (an optional verification follows the commands):

    cd /var/lib/ambari-server/resources/scripts/
    
    ./configs.sh -u admin -p <password> set localhost <clustername> global storm_principal_name <storm_principal_name>
    ./configs.sh -u admin -p <password> set localhost <clustername> global storm_keytab /etc/security/keytabs/storm.service.keytab
    # When adding Falcon
    ./configs.sh -u admin -p <password> set localhost <clustername> global hdfs_user_keytab /etc/security/keytabs/hdfs.headless.keytab 
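
    To verify that the values were saved, the same script can read back the global configuration (an optional check, not part of the original workaround):

    ./configs.sh -u admin -p <password> get localhost <clustername> global
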
  • BUG-20060: Host registration fails during Agent bootstrap on SLES due to timeout.

    Problem: When using SLES and performing host registration using SSH, there are scenarios where Agent bootstrap fails due to a timeout when running the setupAgent.py script. On the host that timed out, you will see the following process hanging:

    c6401.ambari.apache.org:/etc/ # ps -ef | grep zypper
    root     18318 18317  5 03:15 pts/1    00:00:00 zypper -q search -s --match-exact ambari-agent 

    Workaround: You may have a repository registered whose signing keys have not been accepted, so zypper prompts for user interaction (hence the hang and timeout). Run zypper refresh, as shown below, and confirm all repository keys are accepted so that the zypper command can run without user interaction. Alternatively, perform manual Agent setup instead of using SSH for host registration; manual setup does not require Ambari to call zypper non-interactively.
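
    For example, run the following on each affected host and accept any repository keys when prompted:

    zypper refresh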

  • BUG-20035: Enabling NameNode HA wizard fails on the "Initialize JournalNode" step.

    Problem: After upgrading to Ambari 1.6.1 and attempting to enable NameNode HA in an HDP 2.x Stack-based cluster, the HA wizard fails to complete with an error during the "Initialize JournalNode" step. This failure can also occur if your cluster was created using a Blueprint.

    Workaround: Using the Ambari REST API, create the JournalNode and ZKFC service components. This API call can also be made before launching the NameNode HA wizard, to avoid the wizard failing.

    curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST -d '{"components":[{"ServiceComponentInfo":{"component_name":"JOURNALNODE"}},{"ServiceComponentInfo":{"component_name":"ZKFC"}}]}' \
    "http://ambari.server:8080/api/v1/clusters/c1/services?ServiceInfo/service_name=HDFS"

    Replace ambari.server and c1 with your Ambari Server hostname and cluster name, respectively.

  • BUG-20031: Ganglia Monitors display no data after upgrading Ambari to version 1.6.1.

    Problem: Due to an ownership change of local pid files during the upgrade of Ambari to 1.6.1, you may see some of the Ganglia Monitors reported as down. Restarting the monitors from the Ambari UI appears to succeed, but the monitors do not actually restart.

    Workaround: On each host where the Ganglia gmond process is not running, run the following command as root:

     rm /var/run/ganglia/hdp/*.pid 

    This command deletes the old pid files that have different ownership. After deleting these files, start the gmond processes using the Ambari Web UI.
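
    To identify hosts where gmond is not running, a quick process check such as the following (an illustrative sketch, not part of the original workaround) can be used on each host:

    ps -ef | grep '[g]mond'   # prints nothing if gmond is not running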

  • BUG-19973: Background Operations pop-up window does not appear.

    Problem: In rare cases, the Background Operations pop-up window does not automatically appear after triggering an operation such as Service start/stop. This behavior may happen even if the user preference setting "Do not show the Background Operations dialog when starting an operation" is turned off.

    Workaround: To see the Background Operations status, click on the cluster name in the top menu navigation bar.

  • BUG-19934: Restart All is slow for 2,000 node cluster.

    Problem: After selecting "Restart All" from the Services Action menu, the UI can take over a minute to respond in a 2,000-node cluster while Ambari queues all the DataNode component restarts.

    Workaround: To avoid this delay, use "Restart DataNodes" to perform a rolling restart of the DataNodes with a manageable batch size.

  • BUG-19704: AMBARI-5748 Cluster install fails with groupmod error.

    Problem: The cluster fails to install with an error related to running "groupmod". This can occur in environments where groups are managed in LDAP, and not on local Linux machines.

    Fail: Execution of 'groupmod hadoop' returned 10. groupmod: group 'hadoop' does not exist in /etc/group

    Workaround: During the cluster install, on the Customize Services step of the Install Wizard, select the Misc tab and check the "Skip group modifications during install" option.
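
    To confirm that the hadoop group is managed in LDAP rather than locally, a check along these lines (an illustrative sketch, not part of the original workaround) can be run on an affected host:

    getent group hadoop          # resolves through NSS, including LDAP
    grep '^hadoop:' /etc/group   # succeeds only if the group is defined locally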

  • BUG-19529: Problem renaming Host Config Group.

    Problem: Under certain circumstances, when attempting to rename a Host Config Group, clicking OK in the dialog produces a JavaScript error in the UI. Clicking OK a second time saves the name change, but the UI shows the change against the default Host Config Group, even though the rename is actually made and saved correctly.

    Workaround: Refresh the page for the correct Host Config Group names to be displayed.

  • BUG-19433: Problem starting NodeManager on newly added hosts.

    Problem: After adding new hosts to a cluster and assigning only the NodeManager slave component to a host (that is, no DataNode or Clients component), NodeManager fails to start on that host with the following error:

    Fail: Execution of 'hadoop fs -mkdir `rpm -q hadoop | grep -q "hadoop-1" || echo "-p"` /app-logs /mapred /mapred/system /mr-history/tmp /mr-history/done && hadoop fs -chmod -R 777 /app-logs && hadoop fs -chmod  777 /mr-history/tmp && hadoop fs -chmod  1777 /mr-history/done && hadoop fs -chown  mapred /mapred && hadoop fs -chown  hdfs /mapred/system && hadoop fs -chown  yarn:hadoop /app-logs && hadoop fs -chown  mapred:hadoop /mr-history/tmp /mr-history/done' returned 1. mkdir: No FileSystem for scheme: hdfs
    mkdir: No FileSystem for scheme: hdfs
    mkdir: No FileSystem for scheme: hdfs
    mkdir: No FileSystem for scheme: hdfs
    mkdir: No FileSystem for scheme: hdfs

    Workaround: Add the Clients to the host, using the +Add button on the Ambari Web > Hosts page.

  • BUG-18896: Service checks run when using "Start All" while all hosts are in maintenance mode.

    Problem: When all hosts are in maintenance mode (and all services are stopped), performing a Start All Services command still runs the service checks. The service checks fail because all services are stopped.

    Workaround: No action needed. The failing service checks are benign, since all services are stopped.

  • BUG-18520: AMBARI-6061 For large clusters (2,000 nodes), increase Ambari Server max heap size.

    Problem: When installing a large cluster (2,000 nodes) with Ambari, the Ambari Server exits with an OutOfMemoryError during the Host Checks step of the Cluster Install Wizard.

    Workaround: Increase the Ambari Server max heap size to 4096m, using the following steps:

    1. On the Ambari Server, edit /var/lib/ambari-server/ambari-env.sh.

    2. In the AMBARI_JVM_ARGS variable, increase the value -Xmx2048m to -Xmx4096m (an example of the edited line follows these steps).

    3. Start the Ambari Server, as usual.
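
    For reference, the adjusted line in ambari-env.sh should look something like the following sketch; the other flags shown (-Xms, in particular) are illustrative and may differ in your file:

    export AMBARI_JVM_ARGS="$AMBARI_JVM_ARGS -Xms512m -Xmx4096m"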

  • BUG-17511: AMBARI-5883: Ambari installs but does not deploy additional .jar files in oozie.war to support HDP-1 oozie-hive workflows.

    Problem: Manual configuration is required to deploy additional .jar files post-install.

    Workaround: After installing or upgrading to Ambari 1.6.1, use Ambari Web > Services > Config to add the following property to the oozie-site.xml configuration (it can also be set from the command line, as sketched after the property):

    <property>
      <name>oozie.credentials.credentialclasses</name>
      <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
    </property>
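
    Alternatively, the property can be set with the configs.sh script used elsewhere in these notes (a sketch; adjust the host, cluster name, and credentials for your environment):

    cd /var/lib/ambari-server/resources/scripts/
    ./configs.sh -u admin -p <password> set localhost <clustername> oozie-site oozie.credentials.credentialclasses "hcat=org.apache.oozie.action.hadoop.HCatCredentials"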

  • BUG-16556: AMBARI-5435: "Connection refused" errors in the YARN application logs. Timeline service is not started, but yarn-site.xml has the timeline-related configuration enabled.

    Problem: In secure clusters installed by Ambari, ATS is turned off, but in yarn-site.xml the ATS configuration is set to true. As a result, "Connection refused" errors appear in the YARN application logs.

    Workaround: In Ambari Web, browse to Services > YARN > Configs. In the yarn-site.xml section, set the following property to false:

    <property>
      <name>yarn.timeline-service.enabled</name>
      <value>false</value>
    </property>

  • BUG-16534: Quick links to the Oozie Web UI and Falcon Web UI do not work after reconfiguring the port for oozie.base.url.

    Problem: This occurs because the Oozie HTTP port (11000) and Admin port (11001) cannot be changed via Ambari. Oozie uses 11001 as the default Admin port.

    Workaround: Reset the Oozie HTTP port and Admin port to 11000 and 11001, respectively.

  • BUG-13062: After upgrading from Ambari 1.4.1 to Ambari 1.6.1, server start fails on CentOS 5.

    Problem: After upgrading Ambari from 1.4.1 to Ambari 1.6.1 on CentOS 5, ambari-server start will fail the first time.

    Workaround: Execute the ambari-server start command a second time.

  • BUG-7442: Must set hadoop.http.filter.initializers for SPNEGO access.

    Problem: After enabling Kerberos security, you must set the hadoop.http.filter.initializers property in HDFS core-site.xml to enable SPNEGO access.

    Workaround: To enable SPNEGO access to a secure cluster, complete the following steps:

    1. In Ambari Web, browse to Services > HDFS > Configs.

    2. Add the following property to the Custom core-site.xml section (it can also be set with configs.sh, as sketched after the property):

      Property: hadoop.http.filter.initializers
      Value: org.apache.hadoop.security.AuthenticationFilterInitializer
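
    The property can also be set with the configs.sh script used elsewhere in these notes (a sketch; adjust the host, cluster name, and credentials for your environment):

      cd /var/lib/ambari-server/resources/scripts/
      ./configs.sh -u admin -p <password> set localhost <clustername> core-site hadoop.http.filter.initializers org.apache.hadoop.security.AuthenticationFilterInitializer
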
  • BUG-17511: AMBARI-6012 WebHCat jobs do not run after upgrading HDP 1.3 to HDP 2.1.

    Problem: Upgrading the Stack does not ensure that the required .jar files are moved to the Oozie host.

    Workaround: To get all of the WebHCat and Hive jars needed for HDP 1.3, add the HIVE_CLIENT and HCAT_CLIENT components on the Oozie Server via the Ambari REST API, as sketched below.
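
    A sketch of those API calls, modeled on the REST example earlier in these notes. Replace ambari.server, c1, and oozie.host with your Ambari Server hostname, cluster name, and Oozie Server hostname; after the components are created, they must still be installed, for example from Ambari Web > Hosts:

    curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST \
      "http://ambari.server:8080/api/v1/clusters/c1/hosts/oozie.host/host_components/HIVE_CLIENT"
    curl --user admin:admin -H "X-Requested-By: ambari" -i -X POST \
      "http://ambari.server:8080/api/v1/clusters/c1/hosts/oozie.host/host_components/HCAT_CLIENT"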

  • AMBARI-4825: ATS component of YARN fails to start.

    Problem: When installing the patch release of HDP 2.1.2 or HDP 2.1.3, the Application Timeline Server (ATS) component of YARN fails to start with the following error:

    Fail: Execution of 'ls /var/run/hadoop-yarn/yarn/yarn-yarn-historyserver.pid >/dev/null 2>&1 && ps `cat /var/run/hadoop-yarn/yarn/yarn-yarn-historyserver.pid` >/dev/null 2>&1' returned 1.

    Workaround: You must change the following YARN configuration property, either during the Customize Services step of the Cluster Install Wizard or, after cluster install, in Ambari Web > Services > YARN > Configs.

    If you install and use HDP 2.1.1 or 2.1.2, use the following configuration:

    yarn.timeline-service.store-class=org.apache.hadoop.yarn.server.applicationhistoryservice.timeline.LeveldbTimelineStore

    If you install and use HDP 2.1.3, use the following configuration:

    yarn.timeline-service.store-class=org.apache.hadoop.yarn.server.timeline.LeveldbTimelineStore