7. Known Issues

 

Table 1.12. Apache HBase

Apache JIRA 
Hortonworks Bug ID: BUG-42355
Description

After moving an application from HDP 2.2 to HDP 2.3, ACLs do not appear to function the same way.

Workaround: Set hbase.security.access.early_out=false, as in the following example:

<property>
  <name>hbase.security.access.early_out</name>
  <value>false</value>
</property>
  
Apache JIRA: HBASE-13330, HBASE-13647
Hortonworks Bug ID: BUG-36817
Description: test_IntegrationTestRegionReplicaReplication[IntegrationTestRegionReplicaReplication] fails with READ FAILURES
  
Apache JIRA 
Hortonworks Bug ID: BUG-39322
Description

The HBase bulk load process is a MapReduce job that typically runs under the user ID that owns the source data. HBase data files created by the job are then bulk-loaded into HBase RegionServers. During this process, HBase RegionServers move (rename) the bulk-loaded files from the user's directory to the HBase root.dir (/apps/hbase/data). When HDFS data encryption is used, HDFS cannot rename files across encryption zones with different keys.

Workaround: Run the MapReduce job as the hbase user, and specify an output directory in the same encryption zone as the HBase root directory.

  
Apache JIRA: HBASE-13832, HDFS-8510
Hortonworks Bug ID: BUG-40536
Description

When a rolling upgrade is performed for HDFS, the HBase Master can sometimes run out of DataNodes on which to keep its write pipeline active. When this occurs, the HBase Master aborts after a few attempts to keep the pipeline going.

Workaround:

  1. Before performing the rolling upgrade of HDFS, update the HBase configuration by setting "dfs.client.block.write.replace-datanode-on-failure.best-effort" to true, as shown in the example following this list.

  2. Restart the HBase Master.

  3. Perform the rolling upgrade of HDFS.

  4. Undo the configuration change made in Step 1.

  5. Restart the HBase Master.
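The Step 1 change can be expressed as the following property (a minimal sketch; set it through Ambari instead if Ambari manages the HBase configuration, and note that the property name here follows the Apache Hadoop spelling, with a hyphen in best-effort):

<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>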

Note: There is a window of time during the rolling upgrade of HDFS when the HBase Master might be working with just one DataNode, and if that node fails, the WAL data might be lost. In practice, this is an extremely rare situation.

Alternatively, the HBase Master can be turned off during the rolling upgrade of HDFS to avoid the above procedure. If this strategy is taken, client DDL operations and RegionServer failures cannot be handled during this time.

As a final alternative, if the HBase Master fails during the rolling upgrade of HDFS, it can be started again manually.

  
Apache JIRA 
Hortonworks Bug ID: BUG-42186
Description

The HDP 2.3 HBase installation needs the MapReduce classpath modified for HBase functions to work.

Clusters that have Phoenix enabled place the following configuration in hbase-site.xml:

Property: hbase.rpc.controllerfactory.class
Value: org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory

This property points to a class found only in the phoenix-server JAR. To resolve this class at run time for MapReduce jobs that use HBase, the JAR needs to be part of the MapReduce classpath.

Workaround: Update the mapreduce.application.classpath property in the mapred-site.xml file so that it includes /usr/hdp/current/phoenix-client/phoenix-server.jar, as sketched below.
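A minimal mapred-site.xml sketch; the existing classpath value varies by cluster, so append the Phoenix server JAR to whatever value is already present rather than replacing it:

<property>
  <name>mapreduce.application.classpath</name>
  <!-- Keep the cluster's existing entries (elided here) and append the Phoenix server JAR -->
  <value>{existing classpath entries}:/usr/hdp/current/phoenix-client/phoenix-server.jar</value>
</property>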


 

Table 1.13. Apache Hive

Apache JIRA: HIVE-11587
Hortonworks Bug ID: BUG-42500
Description

Hive Hybrid Grace MapJoin can cause OutOfMemory Issues

Hive hybrid grace MapJoin is a new feature in HDP 2.3 (Hive 1.2). MapJoin joins two tables by holding the smaller one in memory; hybrid grace MapJoin spills parts of the small table to disk when the MapJoin does not fit in memory at run time. A bug in this implementation can cause it to use too much memory, resulting in an OutOfMemory error. This applies to the Tez execution engine only.

Workaround: Turn off hybrid grace MapJoin by setting this property in hive-site.xml (see the snippet after these steps):

  • Navigate to Hive > Configs > Advanced > Custom hive-site.

  • Set hive.mapjoin.hybridgrace.hashtable=false.
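If you edit hive-site.xml directly instead of through Ambari, a minimal sketch of the resulting property is:

<property>
  <name>hive.mapjoin.hybridgrace.hashtable</name>
  <value>false</value>
</property>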

  
Apache JIRA: HIVE-11110
Hortonworks Bug ID: BUG-39988
Description: CBO: the default partition filter from the metastore query causes TPC-DS to regress by 3x.
  
Apache JIRA 
Hortonworks Bug ID: BUG-39412
Description

Users should not set datanucleus.identifierFactory = datanucleus2 in the Hive configuration.

Setting datanucleus.identifierFactory to datanucleus2 can potentially lead to data corruption if directSql is enabled. Avoid this setting if you are setting up a new metastore. If you are migrating an old metastore that already has this parameter set, contact Support for steps to address the issue.

  
Apache JIRA: HIVE-10978
Hortonworks Bug ID: BUG-39282
Description

When HDFS is encrypted (data at rest encryption is enabled) and the Hadoop Trash feature is enabled, DROP TABLE and DROP PARTITION have unexpected behavior.

(The Hadoop Trash feature is enabled by setting fs.trash.interval > 0 in core-site.xml.)

When Trash is enabled, the data file for the table should be "moved" to the Trash bin, but if the table is inside an Encryption Zone, this "move" operation is not allowed.

Workaround: There are two ways to address this issue:

1. Use PURGE, as in DROP TABLE ... PURGE. This skips the Trash bin even if Trash is enabled.

2. Set fs.trash.interval = 0, as in the snippet below. Caution: this configuration change must be made in core-site.xml; setting it in hive-site.xml may lead to data corruption if a table with the same name is created later.
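A minimal core-site.xml sketch for option 2. Note that this disables the Trash feature for the whole cluster, not just for Hive:

<property>
  <name>fs.trash.interval</name>
  <value>0</value>
</property>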

  
Apache JIRA 
Hortonworks Bug ID: BUG-38785
Description

With RHEL7, the cpu and cpuacct controllers are managed together by default. The default directory is /sys/fs/cgroup/cpu,cpuacct. The presence of the comma leads to failures when initializing the NodeManager (when using the LinuxContainerExecutor).

Workaround: Create your own directory (such as /sys/fs/cgroup/hadoop/cpu) and set yarn.nodemanager.linux-container-executor.cgroups.mount to true (see the yarn-site.xml sketch following this workaround). This allows the NodeManager to mount the cpu controller, and YARN can then enforce CPU limits for you.

If you wish to mount the cgroups yourself (or provide a mount point), set yarn.nodemanager.linux-container-executor.cgroups.mount to false and ensure that the hierarchy specified in yarn.nodemanager.linux-container-executor.cgroups.hierarchy exists in the mount location. Make sure there are no commas in your pathnames.
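A yarn-site.xml sketch of the first approach. The mount path is an assumption based on the example directory above; when yarn.nodemanager.linux-container-executor.cgroups.mount is true, the NodeManager mounts each controller (such as cpu) beneath the configured mount path:

<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
  <value>true</value>
</property>
<property>
  <!-- Assumed parent directory; the cpu controller would be mounted beneath it -->
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount-path</name>
  <value>/sys/fs/cgroup/hadoop</value>
</property>
<property>
  <!-- cgroup hierarchy used for YARN containers (default value shown) -->
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>/hadoop-yarn</value>
</property>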

  
Apache JIRA 
Hortonworks Bug ID: BUG-37042
Description

Limitations when using the timestamp.formats SerDe parameter.

Two issues involving the timestamp.formats SerDe parameter:

  • Hive displays only 3 decimal digits in returned values, even though it accepts more decimal digits.

    For example, if you run the following commands:

    drop table if exists src_hbase_ts;

    create table src_hbase_ts( rowkey string, ts1 string, ts2 string, ts3 string, ts4 string ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping' = 'm:ts1,m:ts2,m:ts3,m:ts4') TBLPROPERTIES ('hbase.table.name' = 'hbase_ts');

    insert into src_hbase_ts values ('1','2011-01-01T01:01:01.111111111', '2011-01-01T01:01:01.123456111', '2011-01-01T01:01:01.111111111', '2011-01-01T01:01:01.134567890');

    drop table if exists hbase_ts_1;

    create external table hbase_ts_1( rowkey string, ts1 timestamp, ts2 timestamp, ts3 timestamp, ts4 timestamp ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( 'hbase.columns.mapping' = 'm:ts1,m:ts2,m:ts3,m:ts4', 'timestamp.formats' = "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSS") TBLPROPERTIES ('hbase.table.name' = 'hbase_ts');

    select * from hbase_ts_1;

    The timestamp.formats parameter displays:

    1 2011-01-01 01:01:01.111 2011-01-01 01:01:01.123 2011-01-01 01:01:01.111 2011-01-01 01:01:01.134

    When the expected output is:

    1 2011-01-01 01:01:01.111111111 2011-01-01 01:01:01.123456111 2011-01-01 01:01:01.111111111 2011-01-01 01:01:01.134567890

  • The yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSS format accepts timestamp data with up to .SSSSSSSSS fractional digits (9 places to the right of the decimal point) instead of only reading data with exactly .SSSSSSSSS fractional digits.

    For example, if you run the following commands:

    drop table if exists src_hbase_ts; create table src_hbase_ts( rowkey string, ts1 string, ts2 string, ts3 string, ts4 string ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ('hbase.columns.mapping' = 'm:ts1,m:ts2,m:ts3,m:ts4') TBLPROPERTIES ('hbase.table.name' = 'hbase_ts');

    insert into src_hbase_ts values ('1','2011-01-01T01:01:01.111111111', '2011-01-01T01:01:01.111', '2011-01-01T01:01:01.11', '2011-01-01T01:01:01.1');

    drop table if exists hbase_ts_1;

    create external table hbase_ts_1( rowkey string, ts1 timestamp, ts2 timestamp, ts3 timestamp, ts4 timestamp ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( 'hbase.columns.mapping' = 'm:ts1,m:ts2,m:ts3,m:ts4', 'timestamp.formats' = "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSSS") TBLPROPERTIES ('hbase.table.name' = 'hbase_ts');

    select * from hbase_ts_1;

    The actual output is:

    1 2011-01-01 01:01:01.111 2011-01-01 01:01:01.111 2011-01-01 01:01:01.11 2011-01-01 01:01:01.1

    When the expected output is:

    1 2011-01-01 01:01:01.111 NULL NULL NULL


 

Table 1.14. Apache Oozie

Apache JIRA: OOZIE-2311
Hortonworks Bug ID: BUG-39265
Description: An NPE in the Oozie logs while running feed replication tests causes jobs to fail.

 

Table 1.15. Apache Ranger

Apache JIRA: RANGER-577
Hortonworks Bug ID: BUG-38054
Description: Ranger should not change the Hive configuration if authorization is disabled.

 

Table 1.16. Apache Slider

Apache JIRA: SLIDER-909
Hortonworks Bug ID: BUG-40682
Description: The Slider HBase app package fails in a secure cluster with wire encryption on.

 

Table 1.17. Apache Spark

Apache JIRA 
Hortonworks Bug ID: BUG-41644, BUG-41484
Description: Apache and custom Spark builds need an HDP-specific configuration. See the Troubleshooting Spark section (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_spark-quickstart/content/ch_troubleshooting-spark-quickstart.html) for more details.
  
Apache JIRA 
Hortonworks Bug ID: BUG-38046
Description

Spark ATS is missing Kill event

If a running Spark application is killed via YARN (yarn application -kill <appid>), the ATS log will not list the outcome of the kill operation.

  
Apache JIRA 
Hortonworks Bug ID: BUG-39468
Description

When accessing an HDFS file from pyspark, the HADOOP_CONF_DIR environment variable must be set. For example:

export HADOOP_CONF_DIR=/etc/hadoop/conf
[hrt_qa@ip-172-31-42-188 spark]$ pyspark
>>> lines = sc.textFile("hdfs://ip-172-31-42-188.ec2.internal:8020/tmp/PySparkTest/file-01")
.......

If HADOOP_CONF_DIR is not set properly, you might receive the following error:

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  
Available:[TOKEN, KERBEROS] at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

  
Apache JIRA 
Hortonworks Bug ID: BUG-39674
Description: Spark does not yet support wire encryption, dynamic executor allocation, SparkR, GraphX, Spark Streaming, iPython, or Zeppelin.

 

Table 1.18. Apache Tez

Apache JIRA 
Hortonworks Bug ID: BUG-40608
Description

Tez UI View/Download link fails if URL does not match cookie.

Workaround: The Tez UI View/Download link works if the browser accesses a URL that matches the cookie.

Example: The MapReduce JHS cookie is set with an external IP address. If a user clicks the link from inside the cluster, the URL differs from the cookie and the request fails with a dr.who error.


 

Table 1.19. Apache YARN

Apache JIRA: YARN-2194
Hortonworks Bug ID: BUG-39424
Description: The NM fails to come up with the error "Not able to enforce cpu weights; cannot write to cgroup."
  
Apache JIRA 
Hortonworks Bug ID: BUG-39756
Description: The NM web UI drops the ?user.name parameter when redirecting the URL to the MR JHS.
  
Apache JIRA 
Hortonworks Bug ID: BUG-35942
Description

Users must manually configure ZooKeeper security with ResourceManager High Availability.

Currently, the default value of yarn.resourcemanager.zk-acl is world:anyone:rwcda. This means that anyone can read, write, create, delete, or set permissions on the znode, which is neither secure nor acceptable.

To make it more secure, rely on Kerberos for authentication: configure SASL authentication so that only Kerberos-authenticated users can access the ZooKeeper RM state store.

ZooKeeper Configuration

Note: Securing ZooKeeper in this way needs to be done only once for the HDP cluster. If it has already been done (for example, to secure HBase), you do not need to repeat these ZooKeeper steps, provided Apache YARN ResourceManager High Availability uses the same ZooKeeper.

  1. Create a keytab for ZooKeeper called zookeeper.service.keytab and save it in /etc/security/keytabs.

  2. Add the following content to zoo.cfg:

    authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
    jaasLoginRenew=3600000
    kerberos.removeHostFromPrincipal=true
    kerberos.removeRealmFromPrincipal=true
  3. Create zookeeper_client_jaas.conf:

    Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=false
    useTicketCache=true;
    };
  4. Create zookeeper_jaas.conf:

    Server {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    useTicketCache=false
    keyTab="$PATH_TO_ZOOKEEPER_KEYTAB" 
    (such as"/etc/security/keytabs/zookeeper.service.keytab")
    principal="zookeeper/$HOST";
    (such as "zookeeper/xuan-sec-yarn-ha-2.novalocal@SCL42.HORTONWORKS.COM";)
    };
  5. Add the following content to zookeeper-env.sh:

    export CLIENT_JVMFLAGS="-Djava.security.auth.login.config=/etc/zookeeper/conf/zookeeper_client_jaas.conf"
    export SERVER_JVMFLAGS="-Xmx1024m -Djava.security.auth.login.config=/etc/zookeeper/conf/zookeeper_jaas.conf"

Apache YARN Configuration

The following applies to HDP 2.2 and HDP 2.3.

Note: These changes must be made on all nodes that run a ResourceManager (active or standby).

  1. Create a new configuration file, yarn_jaas.conf, in the directory that contains the Hadoop core configurations (typically /etc/hadoop/conf):

    Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    useTicketCache=false
    keyTab="$PATH_TO_RM_KEYTAB" 
    (such as "/etc/security/keytabs/rm.service.keytab")
    principal="rm/$HOST";
    (such as "rm/xuan-sec-yarn-ha-1.novalocal@EXAMPLE.COM";)
    };
  2. Add a new property to yarn-site.xml, assuming that the ResourceManager logs in with a Kerberos principal of the form rm/_HOST@DOMAIN.COM:

    <property>
        <name>yarn.resourcemanager.zk-acl</name>
        <value>sasl:rm:rwcda</value>
    </property>
  3. Add a new YARN_OPTS entry to yarn-env.sh, and make sure it is picked up when the ResourceManagers are started:

    YARN_OPTS="$YARN_OPTS -Dzookeeper.sasl.client=true -Dzookeeper.sasl.client.username=zookeeper -Djava.security.auth.login.config=/etc/hadoop/conf/yarn_jaas.conf -Dzookeeper.sasl.clientconfig=Client"

HDFS Configuration

Note: This applies to HDP 2.1, 2.2, and 2.3.

  1. In hdfs-site.xml, set the following property to secure the ZooKeeper-based failover controller when NameNode HA is enabled:

    <property>
        <name>ha.zookeeper.acl</name>
        <value>sasl:nn:rwcda</value>
    </property>


 

Table 1.20. HDFS and Cloud Deployment

Apache JIRA: HADOOP-11618, HADOOP-12304
Hortonworks Bug ID: BUG-42065
Description

HDP 2.3: A non-HDFS file system cannot be set as the default file system. This prevents S3, WASB, and GCS from being used as the default.

HDP cannot be configured to use an external file system, such as Azure WASB, Amazon S3, or Google Cloud Storage, as the default file system. The default file system is configured in core-site.xml using the fs.defaultFS property. Only HDFS can be configured as the default file system.

These external file systems can be configured for access as optional file systems, just not as the default file system (see the sketch below).
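A core-site.xml sketch of a supported setup: fs.defaultFS stays on HDFS (the NameNode host below is a placeholder), and cloud stores are then referenced by explicit URIs such as wasb:// or s3a:// once the corresponding credentials are configured:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>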


 

Table 1.21. Upgrade

Apache JIRA: HDFS-8782
Hortonworks Bug ID: BUG-41215
Description

Upgrade to block ID-based DN storage layout delays DN registration.

When upgrading from a pre-HDP-2.2 release, a DataNode with many disks, or with blocks that have random block IDs, can take a long time (potentially hours) to upgrade. The DataNode will not register with the NameNode until it finishes upgrading the storage directory.

  
Apache JIRA 
Hortonworks Bug ID: BUG-32401
Description: Rolling upgrade/downgrade should not be used if truncate is turned on. Workaround: Turn truncate off before starting a rolling upgrade or downgrade.