Search code examples
javajpaeclipselinkpersistencetoplink

DatabaseAccessor not connected in a high load scenario


I'm having an issue chasing down the culprit of a "EclipseLink-4005 (..) DatabaseAccessor not connected" error in our app. It's a standalone application (not a web-app) with multiple threads connecting to the same db (Oracle). We use EclipseLink 2.4.2 and its internal connection pool with default settings. Transactions are managed in the app and can be nested. The app works fine most of the time, but occasionally, under high load fails with the "EclipseLink-4005 (..) DatabaseAccessor not connected" error.

I was able to reproduce the error in our test environment but have trouble understanding under what conditions it might happen. Some transactions are long running. Is it possible that some of them time out, causing the respective session to log out and resulting in the error in another thread?

Below are sample stack traces:

Exception [EclipseLink-4005] (Eclipse Persistence Services - 2.4.2.v20130514-5956486): org.eclipse.persistence.exceptions.DatabaseException
Exception Description: DatabaseAccessor not connected.
    at org.eclipse.persistence.exceptions.DatabaseException.databaseAccessorNotConnected(DatabaseException.java:116)
    at org.eclipse.persistence.internal.databaseaccess.DatasourceAccessor.incrementCallCount(DatasourceAccessor.java:309)
    at org.eclipse.persistence.internal.databaseaccess.DatasourceAccessor.beginTransaction(DatasourceAccessor.java:238)
    at org.eclipse.persistence.internal.sessions.AbstractSession.basicBeginTransaction(AbstractSession.java:526)
    at org.eclipse.persistence.sessions.server.ClientSession.addWriteConnection(ClientSession.java:693)
    at org.eclipse.persistence.sessions.server.ServerSession.acquireClientConnection(ServerSession.java:246)
    at org.eclipse.persistence.sessions.server.ClientSession.getAccessor(ClientSession.java:333)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.getAccessor(UnitOfWorkImpl.java:1867)
    at org.eclipse.persistence.internal.jpa.EntityManagerImpl.unwrap(EntityManagerImpl.java:2540)
    (..)

and

Exception [EclipseLink-4005] (Eclipse Persistence Services - 2.4.2.v20130514-5956486): org.eclipse.persistence.exceptions.DatabaseException
Exception Description: DatabaseAccessor not connected.
    at org.eclipse.persistence.exceptions.DatabaseException.databaseAccessorNotConnected(DatabaseException.java:116)
    at org.eclipse.persistence.internal.databaseaccess.DatasourceAccessor.incrementCallCount(DatasourceAccessor.java:309)
    at org.eclipse.persistence.internal.databaseaccess.DatasourceAccessor.rollbackTransaction(DatasourceAccessor.java:671)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.rollbackTransaction(DatabaseAccessor.java:1576)
    at org.eclipse.persistence.internal.sessions.AbstractSession.basicRollbackTransaction(AbstractSession.java:629)
    at org.eclipse.persistence.sessions.server.ClientSession.basicRollbackTransaction(ClientSession.java:186)
    at org.eclipse.persistence.internal.sessions.AbstractSession.rollbackTransaction(AbstractSession.java:3561)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.rollbackTransaction(UnitOfWorkImpl.java:4641)
    at org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork.rollbackTransaction(RepeatableWriteUnitOfWork.java:522)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.release(UnitOfWorkImpl.java:4448)
    at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.rollback(EntityTransactionImpl.java:127)
    (..)

Please let me know if you need any more info. I'm happy to share it, as I'm currently out of ideas.


More info after enabling EclipseLink logging.

In the log I found that a connection gets disconnected shortly before the error occurs (see below). My first thought was that it's because it was marked as invalid (with DatabaseAccessor.setIsValid) but the only issue that happened before the error was an OptimisticLockException thrown 30min earlier, and it was gracefully handled by the app. Moreover, the same connection is then happily used for the next half an hour, so I don't think it was marked as invalid.

TRACE  [2017-09-01 10:01:04.851/-0700] [Worker-659:org.eclipse.persistence.transaction] [ ] 2017-09-01 10:01:04.851--ClientSession(436256688)--Connection(148152749)--Thread(Thread[Worker-659,5,main])-- commit transaction
TRACE  [2017-09-01 10:01:04.914/-0700] [Worker-659:org.eclipse.persistence.connection] [ ] 2017-09-01 10:01:04.914--ServerSession(441145695)--Connection(148152749)--Thread(Thread[Worker-659,5,main])-- Connection released to connection pool [default].
DEBUG  [2017-09-01 10:01:04.914/-0700] [Worker-659:org.eclipse.persistence.connection] [ ] 2017-09-01 10:01:04.914--ServerSession(441145695)--Connection(148152749)--Thread(Thread[Worker-659,5,main])-- disconnect
TRACE  [2017-09-01 10:01:04.929/-0700] [Worker-659:org.eclipse.persistence.transaction] [ ] 2017-09-01 10:01:04.929--UnitOfWork(122187888)--Thread(Thread[Worker-659,5,main])-- end unit of work commit
TRACE  [2017-09-01 10:01:04.929/-0700] [Worker-659:org.eclipse.persistence.transaction] [ ] 2017-09-01 10:01:04.929--UnitOfWork(122187888)--Thread(Thread[Worker-659,5,main])-- resume unit of work
TRACE  [2017-09-01 10:01:04.929/-0700] [Worker-659:org.eclipse.persistence.transaction] [ ] 2017-09-01 10:01:04.929--UnitOfWork(122187888)--Thread(Thread[Worker-659,5,main])-- release unit of work
TRACE  [2017-09-01 10:01:04.929/-0700] [Worker-659:org.eclipse.persistence.connection] [ ] 2017-09-01 10:01:04.929--ClientSession(436256688)--Thread(Thread[Worker-659,5,main])-- client released
TRACE  [2017-09-01 10:01:04.929/-0700] [Worker-659:org.eclipse.persistence.query] [ ] 2017-09-01 10:01:04.929--UnitOfWork(122187888)--Thread(Thread[Worker-659,5,main])-- Execute query ReadAllQuery(...)
TRACE  [2017-09-01 10:01:04.929/-0700] [Worker-659:org.eclipse.persistence.connection] [ ] 2017-09-01 10:01:04.929--ServerSession(441145695)--Connection(985262420)--Thread(Thread[Worker-659,5,main])-- Connection acquired from connection pool [default].
TRACE  [2017-09-01 10:01:04.961/-0700] [Worker-659:org.eclipse.persistence.connection] [ ] 2017-09-01 10:01:04.961--ServerSession(441145695)--Connection(985262420)--Thread(Thread[Worker-659,5,main])-- Connection released to connection pool [default].
WARN   [2017-09-01 10:01:04.961/-0700] [Worker-659:org.eclipse.persistence] [ ] 2017-09-01 10:01:04.961--UnitOfWork(122187888)--Thread(Thread[Worker-659,5,main])-- 
Local Exception Stack: 
Exception [EclipseLink-4005] (Eclipse Persistence Services - 2.4.2.v20130514-5956486): org.eclipse.persistence.exceptions.DatabaseException
Exception Description: DatabaseAccessor not connected.
    at org.eclipse.persistence.exceptions.DatabaseException.databaseAccessorNotConnected(DatabaseException.java:116)
    at org.eclipse.persistence.internal.databaseaccess.DatasourceAccessor.incrementCallCount(DatasourceAccessor.java:309)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:581)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:537)
    at org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:1805)
    at org.eclipse.persistence.sessions.server.ServerSession.executeCall(ServerSession.java:566)
(..)

Is there any other reason why it could get disconnected?


Solution

  • Allow me to answer my own question, so others can benefit from my finding.

    The issue turned out to be caused by a bug in Eclipse Link. See bug 432410, at the time of writing still open.

    The solution in my particular case was to avoid unwraping Connection i.e. do not call EntityManager.unwrap(Connection.class)