Search code examples
javagwtgarbage-collectionheap-dumppermgen

How analyse an OutOfMemory GWT Error and customize heap?


I work on a old Java 1.6 application which use GWT 2.4.0 Since few weeks, it started to show a lot of OutOfMemoryError, and I can't find why... I can't do a lot of modification on this old app, I can't migrate to JDK 1.8, or migrate to a newer GWT version, too much risky.

Example of stacktrace :

2021-01-27 09:40:39,937 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[localhost].[/borneIntranet]] Exception while dispatching incoming RPC call
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2734)
        at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
        at java.util.ArrayList.add(ArrayList.java:351)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.append(ServerSerializationStreamWriter.java:583)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeInt(AbstractSerializationStreamWriter.java:100)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeString(AbstractSerializationStreamWriter.java:134)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$10.write(ServerSerializationStreamWriter.java:165)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:536)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.client.rpc.core.java.util.Collection_CustomFieldSerializerBase.serialize(Collection_CustomFieldSerializerBase.java:44)
        at com.google.gwt.user.client.rpc.core.java.util.ArrayList_CustomFieldSerializer.serialize(ArrayList_CustomFieldSerializer.java:39)
        at com.google.gwt.user.client.rpc.core.java.util.ArrayList_CustomFieldSerializer.serializeInstance(ArrayList_CustomFieldSerializer.java:51)
        at com.google.gwt.user.client.rpc.core.java.util.ArrayList_CustomFieldSerializer.serializeInstance(ArrayList_CustomFieldSerializer.java:28)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:740)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$8.write(ServerSerializationStreamWriter.java:153)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:539)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$8.write(ServerSerializationStreamWriter.java:153)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:539)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.client.rpc.core.java.util.Collection_CustomFieldSerializerBase.serialize(Collection_CustomFieldSerializerBase.java:44)
2021-01-27 09:40:41,594 ERROR [STDERR] 8312399 [scheduler_QuartzSchedulerThread] ERROR org.quartz.core.ErrorLogger - An error occurred while scanning for the next triggers to fire.
2021-01-27 09:40:42,206 ERROR [STDERR] org.quartz.JobPersistenceException: Failed to obtain DB connection from data source 'springNonTxDataSource.scheduler': java.sql.SQLException: Exception d'E/S: Connection reset [See nested exception:$
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:168)
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3784)
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)
2021-01-27 09:40:42,208 ERROR [STDERR] Caused by: java.sql.SQLException: Exception d'E/S: Connection reset
2021-01-27 09:40:42,208 ERROR [STDERR]  at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:111)
2021-01-27 09:40:42,208 ERROR [STDERR]  at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:145)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:254)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:386)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:413)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:164)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:34)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:752)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1188)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.springframework.scheduling.quartz.LocalDataSourceJobStore$2.getConnection(LocalDataSourceJobStore.java:134)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.quartz.utils.DBConnectionManager.getConnection(DBConnectionManager.java:108)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:165)
2021-01-27 09:40:42,209 ERROR [STDERR]  ... 3 more

I try this :

What is weird, is when I watch JMX stats, the heapspace is 25% used, and permspace is 40% used. But in server.log, I see the outOfMemoryError, so how it's possible ?

The server have 5120M of RAM.

The actual launch parameters are :

/opt/jdk/bin/java -Dprogram.name=run.sh -server -Xms2560m -Xmx2560m -XX:MaxPermSize=768M 
-Djava.awt.headless=true -Xdebug -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n 
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9999 
-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false 
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/
-Djava.net.preferIPv4Stack=true ...

I don't know how to process. I don't have much application log either, so I can't know what users really do, like display a GWT screen which use a lot of data, or something else.

EDIT :

Here is some gc log files, each generated in a day, until the app crashed with an OOM.

https://file.io/e1140Wc80yhd

https://file.io/8jRxM1KIgPPB

https://file.io/WvUgIX3ftQ1o


Solution

  • I finally solved my OOM problem with the following steps :

    1\ Create thread dump every 10 seconds, and keep the last 4 hours of dumps. I can't store full heap dump because I'm limited in space disk, so with this method, I only store a lot of "small" files. And I can follow the activity minutes before the OOM and after.

    I modify my crontab to add :

    #Check threads for OOM
    * * * * * ( pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
    * * * * * ( sleep 10 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
    * * * * * ( sleep 20 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
    * * * * * ( sleep 30 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
    * * * * * ( sleep 40 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
    * * * * * ( sleep 50 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
    0 * * * * find /tmp -name "jstack*" -type f -mmin +240 -delete
    

    2\

    When an OOM happens. I download the thread dump files around 20 minutes before and 10 seconds after the time of OOM.

    3\

    I open all the files, and search for a class linked to my own app : "com.al6.xxxxx"

    I see that there is a thread wich begins like 10 minutes before OOM, and base on his ajp-xxx number, I found it in every file until the OOM :

    "ajp-0.0.0.0-8009-23" #7088 daemon prio-5 os-prio-0 tid-0x00007f7ffle1b000 nid-0x673 runnable [0x000000004cfa8000]
    java. lang. Thread. State: RUNNABLE
    at java.util.Arrays.copy0f(Arrays. java:3181)
    at java.util.ArrayList.grow(ArrayList. java: 265)
    at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
    at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
    at java.util.ArrayList.add(ArrayList. java:462)
    at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.append(ServerSerializationStreamWriter.java:583)
    at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeInt(AbstractSerializationStreamWriter.java:100)
    at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeString(AbstractSerializationStreamWriter.java:134)
    at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$10.write(ServerSerializationStreamWriter.java:165)
    at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:536) at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
    at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
    at com.google.gwt.user .server. rpc. impl.ServerSerializationStreanWriter.serialize(ServerSerializationStreamWriter.java:621)
    at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObjectAbstractSerializationStreamWriter.java:126)
    

    So, it's the guilty thread, clearly ! But I only have always the same stack tracelinked to GWT, without the launcher/parent of this running code.

    So searching with the name "ajp-0.0.0.0-8009-23", I finally trace back the thread through his creation, and found a portion of "server code" !

    com.al6.serviceBaseEntry.getXXXXX 
    

    This code return an ArrayList with a lot of elements in it, sometimes 100, sometimes 1500. It's not a lot, but for each element of the list, there is like 30 attributes. This list of element is sent to GWT to generate a drop-down list with autocompletion.

    In theory, the user had to type a sufficiently restrictive filter to return 10-20 elements. But sometimes, the user type a very generic filter, and so the method server side return 1500 elements. GWT tries to serialize the 1500 elements with all their attributes, and boum, OOM.

    I just modify the IHM side code to force the user to enter an enough restrictive filter, and made a check in server side, limiting the result number to 100.

    Since, no more OOM, it's solved !

    In complement, for those who can, think about update GWT, even a little, there is some changes which add details on stack trace. And in my case of OOM, I think with a newer version of GWT, I would have more details on the cause of the OOM in the final stack trace, without doing all this search process.