We're having problems with Amazon-hosted Play application with RDS Mysql.
Sometimes the application ends up unresponsive. Today I checked the thread status with jstack -F and there were multiple threads stuck in native code for c3p0 connection :
Thread 14060: (state = IN_NATIVE)
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
- java.net.SocketInputStream.read(byte[], int, int, int) @bci=79, line=150 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=121 (Compiled frame)
- com.mysql.jdbc.util.ReadAheadInputStream.fill(int) @bci=262, line=114 (Compiled frame)
- com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(byte[], int, int) @bci=176, line=161 (Compiled frame)
- com.mysql.jdbc.util.ReadAheadInputStream.read(byte[], int, int) @bci=48, line=189 (Compiled frame)
- com.mysql.jdbc.MysqlIO.readFully(java.io.InputStream, byte[], int, int) @bci=34, line=3036 (Compiled frame)
- com.mysql.jdbc.MysqlIO.reuseAndReadPacket(com.mysql.jdbc.Buffer, int) @bci=23, line=3489 (Compiled frame)
- com.mysql.jdbc.MysqlIO.reuseAndReadPacket(com.mysql.jdbc.Buffer) @bci=3, line=3478 (Compiled frame)
- com.mysql.jdbc.MysqlIO.checkErrorPacket(int) @bci=12, line=4019 (Compiled frame)
- com.mysql.jdbc.MysqlIO.sendCommand(int, java.lang.String, com.mysql.jdbc.Buffer, boolean, java.lang.String, int) @bci=428, line=2490 (Compiled frame)
- com.mysql.jdbc.MysqlIO.sqlQueryDirect(com.mysql.jdbc.StatementImpl, java.lang.String, java.lang.String, com.mysql.jdbc.Buffer, int, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[]) @bci=559, line=2651 (Compiled frame)
- com.mysql.jdbc.ConnectionImpl.execSQL(com.mysql.jdbc.StatementImpl, java.lang.String, int, com.mysql.jdbc.Buffer, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[], boolean) @bci=130, line=2677 (Compiled frame)
- com.mysql.jdbc.ConnectionImpl.execSQL(com.mysql.jdbc.StatementImpl, java.lang.String, int, com.mysql.jdbc.Buffer, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[]) @bci=17, line=2627 (Compiled frame)
- com.mysql.jdbc.StatementImpl.executeQuery(java.lang.String) @bci=534, line=1556 (Compiled frame)
- com.mysql.jdbc.DatabaseMetaData$9.forEach(java.lang.Object) @bci=174, line=5013 (Compiled frame)
- com.mysql.jdbc.IterateBlock.doForAll() @bci=25, line=51 (Compiled frame)
- com.mysql.jdbc.DatabaseMetaData.getTables(java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) @bci=280, line=4991 (Compiled frame)
- com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnectionNoQuery(java.sql.Connection, java.lang.Throwable[]) @bci=21, line=185 (Compiled frame)
- com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(java.sql.Connection, java.lang.String, java.lang.Throwable[]) @bci=7, line=62 (Compiled frame)
- com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(java.sql.Connection) @bci=4, line=67 (Compiled frame)
- com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(java.lang.Object) @bci=122, line=368 (Compiled frame)
- com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(java.lang.Object) @bci=26, line=310 (Compiled frame)
- com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run() @bci=35, line=1999 (Interpreted frame)
- com.mchange.v2.async.ThreadPerTaskAsynchronousRunner$TaskThread.run() @bci=41, line=255 (Interpreted frame)
DefaultConnectionTester.activeCheckConnectionNoQuery seems to check DatabaseMetaData.getTables if no other connection testing is set up.
C3p0 manual has instructions for configuring proper connection testing and this answer has instructions for c3p0 settings with Play! but I'm wondering:
We set c3p0.automaticTestTable to have more sensible connection testing but that didn't seem to fix the problem, just changing a bit the stack trace for the threads.
In similar case, we now have thread stack as following:
"Task-Thread-for-com.mchange.v2.async.ThreadPerTaskAsynchronousRunner@6064cd08" daemon prio=10 tid=0x00007f53d403a000 nid=0x63dd runnable [0x00007f53dce4a000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
- locked <0x00000000dbecd0d8> (a com.mysql.jdbc.util.ReadAheadInputStream)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3036)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3489)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3478)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4019)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2677)
- locked <0x00000000dbec1a70> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2627)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1556)
- locked <0x00000000dbec1a70> (a com.mysql.jdbc.JDBC4Connection)
- locked <0x00000000d1330078> (a com.mysql.jdbc.StatementImpl)
at com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:73)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:374)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:310)
at com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:1999)
at com.mchange.v2.async.ThreadPerTaskAsynchronousRunner$TaskThread.run(ThreadPerTaskAsynchronousRunner.java:255)
Play threads were waiting to get a DB connection:
"play-thread-3" prio=10 tid=0x00007f53c400b000 nid=0x5b3e in Object.wait() [0x00007f53ddd57000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:579)
- locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
... (näitä vaihtelevat määrät)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
- locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
- locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool)
at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:477)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:525)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128)
at org.hibernate.ejb.connection.InjectedDataSourceConnectionProvider.getConnection(InjectedDataSourceConnectionProvider.java:71)
at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:446)
at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:167)
at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:160)
at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:81)
at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1473)
at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:60)
at play.db.jpa.JPAPlugin.startTx(JPAPlugin.java:377)
at play.db.jpa.JPAPlugin.beforeInvocation(JPAPlugin.java:345)
at play.plugins.PluginCollection.beforeInvocation(PluginCollection.java:473)
at play.Invoker$Invocation.before(Invoker.java:217)
at play.Invoker$Invocation.run(Invoker.java:277)
at play.server.PlayHandler$NettyInvocation.run(PlayHandler.java:229)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Some notes:
br, Touko
Answering to myself:
We will proceed with setting c3p0.automaticTestTable and check out whether there will be any problems later.
UPDATED : Unfortunately we didn't get rid of problems yet :( Also with automaticTestTable the connection testing seems to have get stuck java.net.SocketInputStream.socketRead0(Native Method)
Answering again to myself:
We ended up trying this "set timeout for connection testing query" approach with c3p0:s maxAdministrativeTaskTime. This is a bit of a hack so got to see whether this can be the final solution (if it works).
So far no problems yet, we'll see how this works. No c3p0 errors / stack traces in the log either, I'd expect this interruption to be shown in the logs? Either timeout hasn't occurred yet or we should tune logging configuration to get c3p0 logs shown.
UPDATE (2013-04-04) : After this the problem (probably) hasn't occurred so often but has occurred every now and then :(
Answering again to myself:
We ended first up using socketTimeout specified for MySQL driver so so that connection test doesn't get stuck.
It seems that the underlying TCP connections ended up so that that there is no answer from the server (TCP ACK or anything).
UPDATE: The final solution was got with giving tcpdump for AWS support, they found out some case within AWS infrastructure. They adviced us to open the ephemeral port range in our EC2 instance's security group. More discussion at AWS forum.