In the following, I am using Play Framework 2.4.0 with Scala 2.11.7.
I am stress-testing a simple Play app with Gatling, injecting 5000 users over 60 seconds, and after a few seconds the Play server returns the following:
"Failed to accept a connection." and "java.io.IOException: Too many open files in system".
Here is the associated stacktrace:
22:52:48.943 [application-akka.actor.default-dispatcher-12] INFO play.api.Play$ - Application started (Dev)
22:53:08.939 [New I/O server boss #17] WARN o.j.n.c.s.nio.AbstractNioSelector - Failed to accept a connection.
java.io.IOException: Too many open files in system
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[na:1.8.0_45]
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) ~[na:1.8.0_45]
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) ~[na:1.8.0_45]
at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100) [netty-3.10.3.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [netty-3.10.3.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) [netty-3.10.3.Final.jar:na]
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.10.3.Final.jar:na]
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.10.3.Final.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
I suppose this is due to the system's ulimit (could someone confirm that?), and if so, my question is the following:
How is this kind of error managed in a production environment? Is it simply a matter of setting a high value with ulimit -n <high_value>?
The most foolproof way to check is:
cat /proc/PID/limits
You will see:
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
...
Max open files            1024                 1024                 files
...
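If you don't know the PID offhand, you can look it up and check both the configured limit and how many descriptors the process currently holds (the pgrep pattern java is an assumption; adjust it to match how you start Play):

PID=$(pgrep -n java)                      # newest matching java process
grep "Max open files" /proc/$PID/limits   # the configured limit
ls /proc/$PID/fd | wc -l                  # descriptors currently in use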
To see the limits for your current shell, you can always run:
ulimit -a
And get, among other lines:
...
open files                      (-n) 1024
...
Changing it is best done system-wide via /etc/security/limits.conf, but you can use ulimit -n to change it for the current shell only.
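A minimal sketch of the system-wide change (the user name playuser and the value 16384 are placeholders; pick values that suit your load):

# /etc/security/limits.conf
playuser    soft    nofile    16384
playuser    hard    nofile    16384

# or, for the current shell only, before starting the server:
ulimit -n 16384

Note that the limits.conf change only applies to new login sessions, so log in again (or restart the service) before restarting Play.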
In terms of how to deal with this situation, there is simply no alternative to having enough file descriptors. Set the limit high; if you still hit it, either you have a file-descriptor leak or you work for Facebook. In production, I believe the general recommendation is 16k.
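If you suspect a leak rather than a legitimately high load, a quick, rough check is to watch the process's descriptor count while the test runs (PID is a placeholder for the server's process id, as found above):

watch -n 1 "ls /proc/PID/fd | wc -l"

A count that keeps climbing after the load stabilises usually points to connections or files that are never closed.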