Can anyone please help me to identify the exact are of problem. Is it either JVM, Log4j or something else in our application?
We are running a multi-threaded application using JDK 1.6.0.24 on Solaris 10 (SUNW,Sun-Fire-V240) Server. Which has RMI call to communicate to the client.
Our application went hanged.I have seen the below OutOfMemory in threaddump. However, I know it is because of GC is only able to claim 2% of the object memory.
# java.lang.OutOfMemoryError: GC overhead limit exceeded Heap PSYoungGen total 304704K, used 154560K [0xe0400000, 0xfbc00000, 0xfbc00000) eden space 154560K, 100% used [0xe0400000,0xe9af0000,0xe9af0000) from space 150144K, 0% used [0xf2960000,0xf2960000,0xfbc00000) to space 145856K, 0% used [0xe9af0000,0xe9af0000,0xf2960000) PSOldGen total 897024K, used 897023K [0xa9800000, 0xe0400000, 0xe0400000) object space 897024K, 99% used [0xa9800000,0xe03ffff0,0xe0400000) PSPermGen total 28672K, used 27225K [0xa3c00000, 0xa5800000, 0xa9800000) object space 28672K, 94% used [0xa3c00000,0xa5696580,0xa5800000)
In my case it should be because GC can not claim memory from lots of waiting threads. If I see the Thread Dump. Most of the thread are waiting to acquire the lock on org.apache.log4j.Logger. Using log4j-1.2.15
If you see the first thread's trace below. It acquires lock on 2 objects and other threads (~50) are waiting to get the lock. Almost same trace can be seen for 20 minutes.
Here is the thread dump:
pool-3-thread-51" prio=3 tid=0x00a38000 nid=0xa4 runnable [0xa0d5f000] java.lang.Thread.State: RUNNABLE at java.text.DateFormat.format(DateFormat.java:316) at org.apache.log4j.helpers.PatternParser$DatePatternConverter.convert(PatternParser.java:443) at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65) at org.apache.log4j.PatternLayout.format(PatternLayout.java:506) at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310) at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276) at org.apache.log4j.WriterAppender.append(WriterAppender.java:162) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) - locked (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.niohandlers.NioNotificationHandler.parseAndQueueData(NioNotificationHandler.java:296) at com.airvana.faultServer.niohandlers.NioNotificationHandler.messageReceived(NioNotificationHandler.java:145) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:105) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) "Timer-3" prio=3 tid=0x0099a800 nid=0x53 waiting for monitor entry [0xa1caf000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.spi.RootLogger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.controlapp.export.AbstractOMDataCollector.run(AbstractOMDataCollector.java:100) at java.util.TimerThread.mainLoop(Timer.java:512) at java.util.TimerThread.run(Timer.java:462) "TrapHandlerThreadPool:Thread-10" prio=3 tid=0x014dac00 nid=0x4f waiting for monitor entry [0xa1d6f000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.db.ConnectionPool.printDataSourceStats(ConnectionPool.java:146) at com.airvana.faultServer.db.SQLUtil.freeConnection(SQLUtil.java:267) at com.airvana.faultServer.db.DbAPI.addEventOrAlarmOptimized(DbAPI.java:904) at com.airvana.faultServer.eventProcessing.EventProcessor.processEvent(EventProcessor.java:24) at com.airvana.faultServer.filters.BasicTrapFilter.processTrap(BasicTrapFilter.java:80) at com.airvana.faultServer.eventEngine.EventEngine.notifyTrapProcessors(EventEngine.java:314) at com.airvana.faultServer.eventEngine.NodewiseTrapQueue.run(NodewiseTrapQueue.java:94) at com.airvana.common.utils.ThreadPool$PoolThread.run(ThreadPool.java:356) "RMI TCP Connection(27927)-10.193.3.41" daemon prio=3 tid=0x0186c800 nid=0x1d53 waiting for monitor entry [0x9f84e000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:231) - waiting to lock (a org.apache.log4j.RollingFileAppender) at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) at org.apache.log4j.Category.callAppenders(Category.java:206) - locked (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.processCommunications.ConfigAppCommReceiver.sendEvent(ConfigAppCommReceiver.java:178) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) "pool-3-thread-49" prio=3 tid=0x01257800 nid=0xa1 waiting for monitor entry [0xa0def000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.Category.callAppenders(Category.java:204) - waiting to lock (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.niohandlers.NioNotificationHandler.processSeqNumber(NioNotificationHandler.java:548) at com.airvana.faultServer.niohandlers.NioNotificationHandler.parseAndQueueData(NioNotificationHandler.java:301) at com.airvana.faultServer.niohandlers.NioNotificationHandler.messageReceived(NioNotificationHandler.java:145) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:105) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:385) at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:324) at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:306) at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:223) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:149) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipe "pool-3-thread-44" prio=3 tid=0x00927800 nid=0x9b waiting for monitor entry [0xa0f0f000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.log4j.Category.callAppenders(Category.java:204) - waiting to lock (a org.apache.log4j.Logger) at org.apache.log4j.Category.forcedLog(Category.java:391) at org.apache.log4j.Category.info(Category.java:666) at com.airvana.faultServer.niohandlers.NioNotificationHandler.parseAndQueueData(NioNotificationHandler.java:296) at com.airvana.faultServer.niohandlers.NioNotificationHandler.messageReceived(NioNotificationHandler.java:145) at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:105) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:385) at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:324) at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:306) at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:223) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.timeout.ReadTimeoutHandler.messageReceived(ReadTimeoutHandler.java:149) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:385) at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:324) at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:306) at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:221) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:87) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:567) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:803) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:76) at org.jboss.netty.handler.execution.OrderedMemoryAwareThreadPoolExecutor$ChildExecutor.run(OrderedMemoryAwareThreadPoolExecutor.java:314) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619)
This problem description appears to be similar to that of bug 41214. I'm not sure if this related to your problem, but some of the elements in your posted stack trace are similar to the ones in that bug.
You might therefore, want to follow up on Stephen's advice to verify if you have too many logger calls from multiple threads leading to a lot of lock contention. Switching the level of the logger to a higher level might help, although this is not advisable if you need the log entries; I would advise that this be used after careful deliberation.