'ab' program freezes after lots of requests, why?


Whenever I use 'ab' to benchmark a web server, it will freeze for a while after having sent lots of requests, only to continue after 20 seconds or so.

Consider the following HTTP server simulator, written in Ruby:

require 'socket'

RESPONSE = "HTTP/1.1 200 OK\r\n" +
           "Connection: close\r\n" +
           "\r\n" +
           "\r\n"

buffer = ""
server = TCPServer.new("127.0.0.1", 3000)  # Create TCP server at port 3000.
server.listen(1024)                        # Set backlog to 1024.
loop do
  client = server.accept       # Accept new client.
  client.write(RESPONSE)       # Write a stock "HTTP" response.
  client.close_write           # Shut down the write half of the socket.
  client.read(nil, buffer)     # Read until the client closes its end.
  client.close                 # Close it.
end

I then run ab as follows:

ab -n 45000 -c 10 http://127.0.0.1:3000/

During the first few seconds, ab does its job as it's supposed to and uses 100% CPU:

Benchmarking 127.0.0.1 (be patient)
Completed 4500 requests
Completed 9000 requests
Completed 13500 requests

After about 13500 requests, system CPU usage drops to 0% and ab appears to be frozen. The problem is not in the server: at this moment the server is blocked in accept(), waiting for the next connection. After about 20 seconds ab continues as if nothing happened and uses 100% CPU again, only to freeze again several seconds later.

I suspect something in the kernel is throttling connections, but what and why? I'm using OS X Leopard. I've seen similar behavior on Linux as well, though there the freeze occurs after a much larger number of requests and less often.

This problem prevents me from running large HTTP benchmarks.


Solution

  • It sounds like you are running out of ephemeral ports. To check, run the netstat command while ab is stalled and look for several thousand sockets in the TIME_WAIT state.

    On Mac OS X the default ephemeral port range is 49152 to 65535, for a total of 16384 ports. You can check this with the sysctl command:

    $ sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last
    net.inet.ip.portrange.first: 49152
    net.inet.ip.portrange.last: 65535
    

    Once you run out of ephemeral ports, you normally have to wait for the TIME_WAIT state to expire (2 * maximum segment lifetime) before a particular port number can be reused. You can double the number of ports by changing the range to start at 32768, which is the default starting port on Linux and Solaris. (The maximum port number is 65535, so you cannot raise the high end.)

    $ sudo sysctl -w net.inet.ip.portrange.first=32768
    net.inet.ip.portrange.first: 49152 -> 32768
    

    Note that the official range designated by IANA is 49152 to 65535, and some firewalls may assume that dynamically assigned ports fall within that range. You may need to reconfigure your firewall in order to make use of a larger range outside of your local network.

    It is also possible to reduce the maximum segment lifetime (sysctl net.inet.tcp.msl on Mac OS X), which controls the duration of the TIME_WAIT state, but this is dangerous as it could cause older connections to get mixed up with newer ones that are using the same port number. There are also some tricks involving binding to specific ports with the SO_REUSEADDR option, or closing with the SO_LINGER option, but those also could cause old and new connections to be mixed up, so are generally considered to be bad ideas.
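
The netstat check and the port arithmetic above can be sketched in Ruby, the language used in the question. This is only an illustration: the helper names (count_time_wait, ephemeral_port_count, max_sustained_rate) and the sample netstat text are made up for the example, and the 15-second MSL is an assumed value that you should confirm with sysctl net.inet.tcp.msl on your own machine.

```ruby
# Count sockets in the TIME_WAIT state, given the text printed by `netstat -an`.
# Assumes the TCP state is the last whitespace-separated field on each line.
def count_time_wait(netstat_output)
  netstat_output.each_line.count { |line| line.split.last == "TIME_WAIT" }
end

# Number of ports in an inclusive ephemeral port range.
def ephemeral_port_count(first, last)
  last - first + 1
end

# Rough ceiling on new connections per second to one destination once every
# port has to sit out TIME_WAIT (2 * MSL) before it can be reused.
def max_sustained_rate(first, last, msl_seconds)
  ephemeral_port_count(first, last) / (2.0 * msl_seconds)
end

# Hypothetical netstat excerpt, used so the sketch runs without calling netstat.
SAMPLE_NETSTAT = <<~TEXT
  tcp4  0  0  127.0.0.1.3000  127.0.0.1.49152  TIME_WAIT
  tcp4  0  0  127.0.0.1.3000  127.0.0.1.49153  TIME_WAIT
  tcp4  0  0  127.0.0.1.3000  *.*              LISTEN
TEXT

puts count_time_wait(SAMPLE_NETSTAT)              # 2
puts ephemeral_port_count(49152, 65535)           # 16384
puts ephemeral_port_count(32768, 65535)           # 32768
# Assumed MSL of 15 seconds, i.e. a 30-second TIME_WAIT.
puts max_sustained_rate(49152, 65535, 15).round   # 546
```

To inspect a live system, pass the real command output instead of the sample, for example count_time_wait(`netstat -an`). Under these assumptions, a benchmark that opens connections much faster than the computed rate will exhaust the range and stall until old ports leave TIME_WAIT.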