Search code examples
javaoracle-databasetomcatsap-commerce-cloud

How to fix 'SQLRecoverableException: Closed Connection' in Java


We are working for an ecommerce built with Hybris framework and currently we have an issue with database connection (I suppose) and no idea on how to solve it. It happens only on production environment and only on servers that are used by ESB (2 servers in a total of 40).

Basically, sometimes (1-3/day), we discover sessions waiting for some idle session (SEL*NET message from client). We can only manually kill the holder in order to free these sessions.

All the servers share the same application code and the main difference between ESB and Frontend servers is in the controllers that are called and in the requests count.

ESB Server: 10 requests per minute Frontend Server: 300 requests per minute

In the application log I found a lot of Closed Connection errors on these 2 servers and I think that this is related to our problem but actually I don't know why. In access.log I have this request:

[26/Mar/2019:09:04:39 +0100] "GET /blockorder?orderCode=XXXX&access_token=XXXX HTTP/1.1" 400 122 "-" "AHC/1.0"

and in the console.log I have this:

hybrisHTTP8 2019-03-26 09:04:39,184 ERROR [[10.125.31.2] ] () [de.hybris.platform.jdbcwrapper.ConnectionImpl] error resetting AutoCommit
java.sql.SQLRecoverableException: Closed Connection
        at oracle.jdbc.driver.PhysicalConnection.setAutoCommit(PhysicalConnection.java:3763)
        at de.hybris.platform.jdbcwrapper.ConnectionImpl.doSetAutoCommit(ConnectionImpl.java:431)
        at de.hybris.platform.jdbcwrapper.ConnectionImpl.restoreAutoCommit(ConnectionImpl.java:185)
        at de.hybris.platform.jdbcwrapper.ConnectionImpl.unsetTxBound(ConnectionImpl.java:175)
        at de.hybris.platform.tx.Transaction.unsetTxBoundConnection(Transaction.java:920)
        at de.hybris.platform.tx.Transaction.clearTxBoundConnectionAndNotify(Transaction.java:897)
        at de.hybris.platform.tx.Transaction.clearTxBoundConnectionAndNotifyRollback(Transaction.java:887)
        at de.hybris.platform.tx.Transaction.rollbackOuter(Transaction.java:1084)
        at de.hybris.platform.tx.Transaction.rollback(Transaction.java:1028)
        at de.hybris.platform.tx.Transaction.commit(Transaction.java:690)
        at de.hybris.platform.tx.Transaction.finishExecute(Transaction.java:1218)
        at de.hybris.platform.tx.Transaction.execute(Transaction.java:1205)
        at de.hybris.platform.tx.Transaction.execute(Transaction.java:1160)
        at de.hybris.platform.jalo.Item.setAllAttributes(Item.java:2082)
        at de.hybris.platform.jalo.Item.setAllAttributes(Item.java:2057)
        at de.hybris.platform.servicelayer.internal.converter.impl.ItemModelConverter.storeAttributes(ItemModelConverter.java:1503)
        at de.hybris.platform.servicelayer.internal.converter.impl.ItemModelConverter.save(ItemModelConverter.java:730)
        at de.hybris.platform.servicelayer.internal.model.impl.wrapper.ModelWrapper.save(ModelWrapper.java:336)
        at de.hybris.platform.servicelayer.internal.model.impl.ResolvingModelPersister.saveOthers(ResolvingModelPersister.java:64)
        at de.hybris.platform.servicelayer.internal.model.impl.ResolvingModelPersister.persist(ResolvingModelPersister.java:49)
        at de.hybris.platform.servicelayer.internal.model.impl.DefaultModelService.saveViaJalo(DefaultModelService.java:1059)
        at de.hybris.platform.servicelayer.internal.model.impl.DefaultModelService.doJaloPersistence(DefaultModelService.java:648)
        at de.hybris.platform.servicelayer.internal.model.impl.DefaultModelService.persistWrappers(DefaultModelService.java:1002)
        at de.hybris.platform.servicelayer.internal.model.impl.DefaultModelService.performPersistenceOperations(DefaultModelService.java:626)
        at de.hybris.platform.servicelayer.internal.model.impl.DefaultModelService.saveAllInternal(DefaultModelService.java:620)
        at de.hybris.platform.servicelayer.internal.model.impl.DefaultModelService.saveAll(DefaultModelService.java:600)
        at de.hybris.platform.servicelayer.internal.model.impl.DefaultModelService.save(DefaultModelService.java:548)
        at com.test.fulfilment.process.impl.DefaultOrderProcessService.requestForcedOrderCancellation(DefaultOrderProcessService.java:131)
        at com.test.application.order.facades.impl.DefaultOrderFacade.forcedOrderCancel(DefaultOrderFacade.java:62)
        at com.test.application.controllers.OrderController.blockOrder(OrderController.java:520)

Our pool config is the following:

{
    "maxIdle": 90,
    "minIdle": 2,
    "maxActive": 90,
    "maxWait": 10000,
    "whenExhaustedAction": 1,
    "testOnBorrow": true,
    "testOnReturn": true,
    "testWhileIdle": true,
    "timeBetweenEvictionRunsMillis": 10000,
    "numTestsPerEvictionRun": 100,
    "minEvictableIdleTimeMillis": 300000,
    "softMinEvictableIdleTimeMillis": -1,
    "lifo": true
}

Our tomcat config is:

tomcat.generaloptions.JDBC=-Doracle.jdbc.ReadTimeout=60000
tomcat.generaloptions.TIMEOUT=-Dsun.net.client.defaultConnectTimeout\=60000 -Dsun.net.client.defaultReadTimeout\=60000
tomcat.ajp.acceptCount=100
tomcat.ajp.maxThreads=400
tomcat.maxthreads=400
tomcat.minsparethreads=50
tomcat.maxidletime=10000
tomcat.connectiontimeout=120000
tomcat.acceptcount=100

We tried to remove the oracle.jdbc.ReadTimeout but the result was that we started to see Closed Connections on the other servers.

The code that trigger this error is pretty simple (and it works in the 95% of time):

    @Override
    public boolean requestForcedOrderCancellation(final OrderModel order) {
        Transaction.current().begin();
        try {
            modelService.lock(order.getPk());
            modelService.refresh(order);
            order.setForcedCancelled(true);
            modelService.save(order);
            Transaction.current().commit();
            return true;
        catch (Exception e) {
            LOG.error(e.getMessage(), e);
            Transaction.current().rollback();
            return false;
        }
    }

We tried also without explicit locking and the problem is exactly the same. It seems like the connection is already closed and we cannot rollback (or commit) the transactions that are still waiting in DB.

I expect to avoid this lock and these closed connection errors.


Solution

  • We found that issue was due to uncatched exception/error in transactional code.

    Server answer with error and Hybris did not rollback the transaction that is still open. The same thread is reused sometime later (maybe some days) and old transaction is still open.

    When this corrupted thread is used for locking some rows in database, even if we commit the transaction in the code, the same is not committed to database because internally Hybris has a transaction counter to handle inner transactions (maybe used in called methods). Transaction is commited/rollback to DB only when we use commit/rollback method and transaction counter is 1.

    Request1:
        Transaction.begin() // Hybris Counter = 1
        doSomething() // This throws Exception, Application Exit, Hybris Counter is still 1
        try {
            Transaction.commit()
        } catch (Exception e) {
            Transaction.rollback();
        }
    
    Request2 on same thread:
        Transaction.begin() // Hybris Counter now is 2
        doSomething() // Works OK, Hybris Counter is still 2
        try {       
            Transaction.commit() // HybrisCounter -= 1
            // Transaction is not commited to DB because Hybris counter now is 1
        } catch (Exception e) {
            Transaction.rollback();
        }
    
    Request3 on same thread:
        Transaction.begin() // Hybris Counter now is 2
        lockRow() 
        // Row is locked for the whole transaction (the same opened in R1)
        // Everything is OK
        try {       
            Transaction.commit() // HybrisCounter -= 1
            // Transaction is not commited to DB because Hybris counter now is 1
            // Row is still locked
            // Next requests to the same row will wait lock forever
        } catch (Exception e) {
            Transaction.rollback();
        }