Tags: c, sockets, tcp, tcp-keepalive

TCP KEEPALIVE not working as expected on Linux


I have a simple HTTP/WebSocket server/client application with the following setup:

  • The server is written in C and runs on Linux.
  • The client is a browser (Google Chrome) on a different computer; it connects to the server and talks to it over WebSocket.
  • When the server accepts the connection cfd from the client, it sets cfd to non-blocking and enables keepalive like this:
    int flags = 1;  /* non-zero enables SO_KEEPALIVE */
    if (setsockopt(cfd, SOL_SOCKET, SO_KEEPALIVE, &flags, sizeof(flags)) < 0) { perror("ERROR: setsockopt(), SO_KEEPALIVE"); exit(EXIT_FAILURE); }
  • After the client connected to the server, I unplugged the client computer's network cable. The server keeps sending data to the client over the TCP socket cfd (non-blocking).
  • With the client computer's cable still unplugged, I ran netstat to watch the state of the connection between the server and the client. It remained in the "ESTABLISHED" state for around 960 seconds.

Now my question is where the 960 seconds comes from. I thought it was controlled by net.ipv4.tcp_keepalive_time. However, no matter what value I set with sudo sysctl -w net.ipv4.tcp_keepalive_time=XXX, the TCP socket remains in the ESTABLISHED state for about 960 seconds rather than for the configured keepalive time.


Solution

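  • I believe you are looking for net.ipv4.tcp_retries2, which controls how many times Linux retransmits unacknowledged data before giving up and closing the connection. Keepalive probes are only sent on an idle connection; because your server keeps sending data, the retransmission timer is what applies here, not net.ipv4.tcp_keepalive_time. tcp_retries2 defaults to 15, and the retransmission timeout doubles after each attempt up to a cap of 120 seconds (TCP_RTO_MAX in the kernel source). The kernel documentation gives about 924.6 seconds as the lower bound for the resulting timeout, so your observed 960 seconds is in line with what is expected.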

    If you reduce tcp_retries2, it will retransmit fewer times and close the connection faster.
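As a rough check of how the retry count translates into seconds, the sketch below adds up the retransmission intervals. It assumes a 200 ms starting timeout that doubles after every attempt and is capped at 120 seconds, which are the usual Linux values; the real starting timeout depends on the measured round-trip time, so actual connections can take somewhat longer. The function name hypothetical_timeout is made up for illustration and is not a kernel or libc API.

```c
#include <stdio.h>

/*
 * Illustrative only: sum the waits for the first transmission plus
 * `retries` retransmissions, doubling the timeout each time and
 * capping it at rto_max. All values are in seconds.
 */
static double hypothetical_timeout(int retries, double rto_min, double rto_max)
{
    double total = 0.0;
    double rto = rto_min;

    for (int i = 0; i <= retries; i++) {
        total += rto;          /* wait one full timeout before the next attempt */
        rto *= 2.0;            /* exponential backoff */
        if (rto > rto_max)
            rto = rto_max;     /* assumed cap of 120 s on Linux */
    }
    return total;
}
```

Under these assumptions, hypothetical_timeout(15, 0.2, 120.0) comes to about 924.6 seconds, and hypothetical_timeout(5, 0.2, 120.0) to about 12.6 seconds, which shows how quickly a lower tcp_retries2 shortens the wait.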

    You may also be interested in looking at some of the answers to this StackOverflow question
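
Since tcp_retries2 is system-wide, a per-socket alternative may be more convenient. The following is a minimal sketch, assuming a Linux system where TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT and TCP_USER_TIMEOUT are available (see tcp(7)). The helper name set_dead_peer_limits and its parameters are hypothetical, and any values passed to it are only examples, not recommendations.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the original post): tune dead-peer
 * detection on a single TCP socket. Returns 0 on success, -1 on error
 * with errno set by setsockopt().
 */
static int set_dead_peer_limits(int fd, int idle_s, int intvl_s, int cnt,
                                unsigned int user_timeout_ms)
{
    int on = 1;

    /* Enable keepalive probes; these only fire while the connection is idle. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    /* Idle seconds before the first probe (per-socket tcp_keepalive_time). */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    /* Seconds between probes. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    /* Unanswered probes before the connection is dropped. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
        return -1;
    /* Milliseconds that sent data may stay unacknowledged before the
       kernel closes the connection; this covers the "still sending" case. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &user_timeout_ms, sizeof(user_timeout_ms)) < 0)
        return -1;
    return 0;
}
```

A call such as set_dead_peer_limits(cfd, 10, 5, 3, 30000) right after accept() would, under these assumptions, ask the kernel to give up on unacknowledged data after roughly 30 seconds for that socket only, leaving the system-wide sysctl values untouched.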