Our Java web application is experiencing random blocking issues we have never seen before, caused by a sudden spike in thread creation and the subsequent exhaustion of the maximum number of threads our Tomcat can create (the default, 200).
The last spike created 120 new threads in 10 minutes, taking the count from 80 to 200 and blocking our application. The usual thread count is 70-80.
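For context, that cap comes from the `maxThreads` attribute on the Tomcat Connector in `server.xml` (200 when the attribute is omitted). An illustrative connector with the default spelled out explicitly (ports and SSL settings are just examples, not our actual config):

```xml
<!-- server.xml: maxThreads caps the request-processing threads; 200 is the default -->
<Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
           maxThreads="200" SSLEnabled="true" scheme="https" secure="true" />
```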
A thread dump shows that all those threads are in WAITING state, and it seems they are all waiting for a new Connection to the DBMS from c3p0.
This is an example stack:
"http-nio-8443-exec-77" WAITING (Thread ID: 10016)
java.lang.Object.wait(long)
com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(long) BasicResourcePool.java:1414
com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(long) BasicResourcePool.java:606
com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(long) BasicResourcePool.java:526
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse() C3P0PooledConnectionPool.java:755
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection() C3P0PooledConnectionPool.java:682
com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection() AbstractPoolBackedDataSource.java:140
org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection() AbstractRoutingDataSource.java:164
org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection() DatasourceConnectionProviderImpl.java:139
org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection() AbstractSessionImpl.java:380
org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection() LogicalConnectionImpl.java:228
org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection() LogicalConnectionImpl.java:171
org.hibernate.internal.SessionImpl.connection() SessionImpl.java:450
org.springframework.orm.hibernate4.HibernateTransactionManager.doBegin(Object, TransactionDefinition) HibernateTransactionManager.java:450
org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(TransactionDefinition) AbstractPlatformTransactionManager.java:373
org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(PlatformTransactionManager, TransactionAttribute, String) TransactionAspectSupport.java:463
org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(Method, Class, TransactionAspectSupport$InvocationCallback) TransactionAspectSupport.java:276
org.springframework.transaction.interceptor.TransactionInterceptor.invoke(MethodInvocation) TransactionInterceptor.java:96
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() ReflectiveMethodInvocation.java:179
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(MethodInvocation) ExposeInvocationInterceptor.java:92
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed() ReflectiveMethodInvocation.java:179
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(Object, Method, Object[], MethodProxy) CglibAopProxy.java:653
This is our c3p0 configuration for this DataSource:
<property name="maxIdleTime" value="600" />
<property name="maxPoolSize" value="300" />
<property name="maxStatements" value="200" />
<property name="maxStatementsPerConnection" value="10" />
<property name="minPoolSize" value="15" />
<property name="idleConnectionTestPeriod" value="100" />
<property name="acquireIncrement" value="3" />
<property name="numHelperThreads" value="20" />
I'm currently trying to add a timeout parameter to help me debug those cases:
<property name="checkoutTimeout" value="30" />
I read that I should use debugUnreturnedConnectionStackTraces
and unreturnedConnectionTimeout
to better debug this kind of problem, but I don't yet understand them very well and I'm afraid to use them in production.
Is there any other configuration I could try to help debug the problem? Some other timeout setting?
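For reference, the two debugging properties mentioned above are set the same way as the others. A hedged example (the 300-second value is illustrative; note that unreturnedConnectionTimeout, which is expressed in seconds, forcibly destroys any connection checked out longer than that, so it can break legitimate long-running transactions):

```xml
<!-- Treat any connection checked out for more than 300 s as leaked and destroy it -->
<property name="unreturnedConnectionTimeout" value="300" />
<!-- Log the stack trace of the code that checked out each unreturned connection -->
<property name="debugUnreturnedConnectionStackTraces" value="true" />
```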
Update:
I tried setting the checkoutTimeout parameter on the DataSource in question:
<property name="checkoutTimeout" value="30" />
This property forces all threads waiting to check out a Connection from c3p0 to give up once the timeout expires while the pool is exhausted, throwing an SQLException. Note that checkoutTimeout is expressed in milliseconds, so value="30" means 30 ms; to wait 30 seconds the value should be 30000.
Not long after putting this property in production, we started logging this kind of error:
Caused by: org.hibernate.exception.GenericJDBCException: Could not open connection
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:54)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:235)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:171)
at org.hibernate.internal.SessionImpl.connection(SessionImpl.java:450)
at org.springframework.orm.hibernate4.HibernateTransactionManager.doBegin(HibernateTransactionManager.java:450)
... 119 more
Caused by: java.sql.SQLException: An attempt by a client to checkout a Connection has timed out.
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:118)
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:77)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:687)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:140)
at org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection(AbstractRoutingDataSource.java:164)
at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:139)
at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:380)
at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:228)
... 122 more
Caused by: com.mchange.v2.resourcepool.TimeoutException: A client timed out while waiting to acquire a resource from com.mchange.v2.resourcepool.BasicResourcePool@14e83e8f -- timeout at awaitAvailable()
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1416)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:606)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:526)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutAndMarkConnectionInUse(C3P0PooledConnectionPool.java:755)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:682)
... 127 more
So I guess this is proof that our pool got exhausted (the maxPoolSize is 300). Is that correct?
We were able to solve the problem with the help of JMX, by monitoring the state of the c3p0 data sources.
Using JVisualVM (or JConsole, if you don't have an Oracle JDK) we looked at the state of our datasource, and this is what it showed:
Our pool size is 300, meaning that c3p0 can hand out at most 300 connections at any given time; 95 were already busy, and that number was rising.
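The same numbers can also be read programmatically from inside the JVM. A minimal sketch using only the standard javax.management API; the object-name pattern and the attribute names (numBusyConnectionsDefaultUser, numIdleConnectionsDefaultUser) are assumptions based on what c3p0 registers, so verify them in JConsole's MBeans tab first:

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class C3p0JmxPeek {
    // Finds all MBeans in c3p0's JMX domain (empty if c3p0 isn't running in this JVM).
    static Set<ObjectName> findPools() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // c3p0 registers its MBeans under the com.mchange.v2.c3p0 domain
        return server.queryNames(new ObjectName("com.mchange.v2.c3p0:*"), null);
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        for (ObjectName name : findPools()) {
            // Skip non-pool MBeans (c3p0 also registers a C3P0Registry MBean)
            String type = name.getKeyProperty("type");
            if (type == null || !type.startsWith("PooledDataSource")) continue;
            System.out.println(name
                    + " busy=" + server.getAttribute(name, "numBusyConnectionsDefaultUser")
                    + " idle=" + server.getAttribute(name, "numIdleConnectionsDefaultUser"));
        }
    }
}
```

Polling these attributes periodically and alerting when busy connections approach maxPoolSize would have caught this leak long before the pool was exhausted.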
With a bit of trial and error, and knowing that the problem only occurred for one particular customer, we started testing the new features that customer used (the problem had only appeared recently). We found a particular use case with which we could replicate the issue: it was leaking connections constantly.
In the codebase, the problem was that this use case opened a session but never closed it.
It was something like this:
this.sessionFactory.withOptions().noInterceptor().openSession();
We added code to close the session, and the problem was solved:
session.flush();
session.close();
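The failure mode itself is easy to reproduce in isolation. A minimal self-contained sketch (this is not c3p0 code: a Semaphore stands in for the bounded pool, and checkout/checkin stand in for getConnection()/close()), showing how missing closes drain the pool and how a try/finally guarantees the return:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolLeakDemo {
    // A pretend pool with maxPoolSize = 3; each permit is one "connection"
    public static final Semaphore pool = new Semaphore(3);

    // Like a pool checkout with checkoutTimeout (in milliseconds)
    public static boolean checkout(long timeoutMs) {
        try {
            return pool.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Like Connection.close(): returns the "connection" to the pool
    public static void checkin() {
        pool.release();
    }

    public static void main(String[] args) {
        // Leaking pattern: checkouts with no matching checkin drain the permits
        for (int i = 0; i < 3; i++) checkout(10);
        System.out.println("checkout after leaks: " + checkout(10)); // false: exhausted

        pool.release(3); // reset the demo pool

        // Correct pattern: always return the connection in finally
        if (checkout(10)) {
            try {
                // ... use the "connection" ...
            } finally {
                checkin(); // guaranteed even if the work throws
            }
        }
        System.out.println("checkout after try/finally: " + checkout(10)); // true
    }
}
```

In our real code the same guarantee means wrapping the work between openSession() and close() in a try/finally (or relying on Spring to manage the session), so a thrown exception can never leave a connection checked out.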