5.3. Known Issues for YARN

  • BUG-20371: Sleep job fails with "java.lang.OutOfMemoryError: Java heap space" with specific queue allocations

    Problem: With certain Capacity Scheduler queue allocations, sleep jobs fail in the map task with the following error:

    2014-07-21 11:36:30,539 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
    2014-07-21 11:36:32,076 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: org.apache.hadoop.mapreduce.SleepJob$EmptySplit@5e00d708
    2014-07-21 11:36:32,186 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    2014-07-21 11:36:33,673 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
    	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:962)
    	at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:390)
    	at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:79)
    	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
    	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:746)
    	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at javax.security.auth.Subject.doAs(Subject.java:415)
    	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
    	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
    
    2014-07-21 11:36:33,681 INFO [communication thread] org.apache.hadoop.mapred.Task: Communication exception: java.io.IOException: Failed on local exception: java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/172.18.145.86:38856 remote=/172.18.145.81:56568]. 59999 millis timeout left.; Host Details : local host is: "ambari-sec-1405917050-yarn-6.cs1cloud.internal/172.18.145.86"; destination host is: "ambari-sec-1405917050-yarn-4.cs1cloud.internal":56568; 
    	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
    	at org.apache.hadoop.ipc.Client.call(Client.java:1351)
    	at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
    	at com.sun.proxy.$Proxy6.ping(Unknown Source)
    	at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:736)
    	at java.lang.Thread.run(Thread.java:744)
    Caused by: java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/172.18.145.86:38856 remote=/172.18.145.81:56568]. 59999 millis timeout left.
    	at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:352)
    	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    	at java.io.FilterInputStream.read(FilterInputStream.java:133)
    	at java.io.FilterInputStream.read(FilterInputStream.java:133)
    	at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:457)
    	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    	at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
    	at java.io.DataInputStream.readInt(DataInputStream.java:387)
    	at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:995)
    	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891)
                        
  • BUG-20335: Backport BUG-10320 to HDP 2.0.12.0

    Problem: BUG-10320 was fixed in HDP 2.0.1.3. Whether the fix should be backported to HDP 2.0.12.0 is still being evaluated.
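
Note on BUG-20371: the stack trace shows the OutOfMemoryError raised in MapTask$MapOutputBuffer.init, the point where the map task allocates its in-memory sort buffer (sized by mapreduce.task.io.sort.mb). This entry does not state a root cause, so the following is only a sketch of one sanity check that may be worth running, on the assumption that the sort buffer is large relative to the map task heap set through -Xmx in mapreduce.map.java.opts. The function names, the sample values, and the 0.5 threshold are hypothetical and are not taken from HDP documentation.

```python
import re


def parse_xmx_mb(java_opts):
    """Return the -Xmx heap size in MB from a JVM options string, or None if absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    value, unit = int(match.group(1)), match.group(2).lower()
    # A bare number after -Xmx is interpreted by the JVM as bytes.
    factor = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return value * factor


def sort_buffer_fits(java_opts, io_sort_mb, max_fraction=0.5):
    """Check whether the sort buffer stays within a fraction of the task heap.

    max_fraction is an illustrative threshold (an assumption), not a
    documented Hadoop limit.
    """
    heap_mb = parse_xmx_mb(java_opts)
    if heap_mb is None:
        raise ValueError("no -Xmx setting found in: %r" % java_opts)
    return io_sort_mb <= heap_mb * max_fraction


if __name__ == "__main__":
    # Hypothetical settings: a 256 MB sort buffer against a 200 MB heap.
    print(sort_buffer_fits("-Xmx200m", 256))
    # Hypothetical settings: a 100 MB sort buffer against a 1024 MB heap.
    print(sort_buffer_fits("-Xmx1024m", 100))
```

If the check returns False for the values in effect on the failing queue, the sort buffer alone would consume most or all of the task heap, which would be consistent with the trace above. It does not confirm that this is the cause of BUG-20371.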

