Is there a way to update and restart a server while keeping its socket in a "suspended" state?


There's a program listening for and answering requests (proprietary binary protocol) on a TCP port. This program needs to be updated, so it must be restarted, after which it should continue doing its work on the same port.

According to its protocol, all current connections can be closed, because clients re-establish their connections right after they are closed. New connections, however, should be held (not refused) for the few seconds until the program has restarted. How could this be done?

As soon as the program is running again, all connections held for that port could then be released to reach the listening socket.

Let's imagine the following steps:

  1. A server program is running and listening on a given port, let's say port A.
  2. It asks an external resource (like the operating system or a third-party module) to hold all connections coming to port A.
  3. It closes all connections currently established to port A. THIS MIGHT TAKE TIME (maybe a couple of minutes, because it first finishes all requested services).
  4. It is restarted: a brand new executable comes to life and starts listening on port A.
  5. It asks the external resource to release all held connections, so they can now reach port A, which is ready to receive connections again.

Steps 2 and 5 are just assumptions.


Solution

  • In POSIXy systems (Linux, Mac, BSDs) there is a rather simple, but clever way for the service process to achieve this. It does not even need any privileges to do so.

    The core idea is very simple: When the service knows it will restart, it will create a detached child process (in a new session and process group, so it'll be reparented to init) holding the listening socket(s). Then, the parent will simply no longer accept() any new connections, finish any incomplete responses, and re-execute itself with the updated binary.
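
    To make the detaching step concrete, here is a minimal sketch of it, not a definitive implementation. The names spawn_holder and hold_until_released are hypothetical, and the control descriptor ctl_fd is only a stand-in for the Unix domain socket handshake described in this answer: the holder exits as soon as one byte arrives on it.

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Holder body for this sketch (hypothetical): block until one byte
     * arrives on ctl_fd, then return. The holder keeps listen_fd open
     * simply by never closing it; it must never accept() on it. */
    static int hold_until_released(int listen_fd, int ctl_fd)
    {
        char byte;
        ssize_t n;

        (void)listen_fd;
        do {
            n = read(ctl_fd, &byte, 1);
        } while (n == -1 && errno == EINTR);
        return n == 1 ? 0 : 1;
    }

    /* Fork a detached holder that inherits listen_fd.
     * Returns 0 in the calling server on success, -1 on failure. */
    static int spawn_holder(int listen_fd, int ctl_fd)
    {
        pid_t child, reaped;
        int status;

        assert(listen_fd >= 0 && ctl_fd >= 0);

        child = fork();
        if (child == -1)
            return -1;

        if (child == 0) {
            pid_t holder;

            /* Intermediate child: new session and process group. */
            if (setsid() == -1)
                _exit(127);

            /* Second fork: the grandchild is not a session leader and
             * is reparented once this intermediate child exits. */
            holder = fork();
            if (holder == -1)
                _exit(127);
            if (holder > 0)
                _exit(0);

            /* Holder process: exiting closes its copy of listen_fd. */
            _exit(hold_until_released(listen_fd, ctl_fd));
        }

        /* Original server: reap the short-lived intermediate child. */
        do {
            reaped = waitpid(child, &status, 0);
        } while (reaped == -1 && errno == EINTR);

        if (reaped != child || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return -1;
        return 0;
    }
    ```

    After spawn_holder() returns 0, the old server can stop calling accept(), finish its pending responses, and exec the updated binary. The listening socket stays open in the holder the whole time, so the kernel keeps queueing new connections.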

    The holder process will also listen for incoming connections on a Unix domain socket (stream or seqpacket, that is, connection-oriented). The updated server instance will connect to the holder process and send an SCM_CREDENTIALS ancillary payload, which carries the kernel-verified user, group, and process ID of the sender. The holder process can use the process ID to check whether the connecting party is an updated version of the binary. (In Linux, this can be done by comparing the stat() results of /proc/PID/exe and the expected executable.) If the other end is authorized, the holder transfers the listening socket descriptors back using an SCM_RIGHTS ancillary payload. Finally, the updated service sends a final thank-you message, which tells the holder process to exit (which also closes its copies of the listening socket descriptors).
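
    The descriptor transfer itself could look roughly like the following sketch. The helper names send_fd and recv_fd are hypothetical, and only a single descriptor is passed per message. The credential check is deliberately left out here, so this shows only the SCM_RIGHTS part.

    ```c
    #include <assert.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <sys/uio.h>
    #include <unistd.h>

    /* Send descriptor fd over the connected Unix domain socket sock.
     * Returns 0 on success, -1 on error. */
    static int send_fd(int sock, int fd)
    {
        char dummy = 'F';   /* at least one byte of ordinary data */
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        union {
            struct cmsghdr align;               /* ensures alignment */
            char buf[CMSG_SPACE(sizeof(int))];
        } ctl;
        struct msghdr msg;
        struct cmsghdr *cmsg;

        assert(fd >= 0);
        memset(&ctl, 0, sizeof ctl);
        memset(&msg, 0, sizeof msg);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = ctl.buf;
        msg.msg_controllen = sizeof ctl.buf;

        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
    }

    /* Receive one descriptor from sock.
     * Returns the new descriptor, or -1 on error. */
    static int recv_fd(int sock)
    {
        char dummy;
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        union {
            struct cmsghdr align;
            char buf[CMSG_SPACE(sizeof(int))];
        } ctl;
        struct msghdr msg;
        struct cmsghdr *cmsg;
        int fd;

        memset(&ctl, 0, sizeof ctl);
        memset(&msg, 0, sizeof msg);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = ctl.buf;
        msg.msg_controllen = sizeof ctl.buf;

        if (recvmsg(sock, &msg, 0) != 1)
            return -1;

        cmsg = CMSG_FIRSTHDR(&msg);
        if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
            cmsg->cmsg_type != SCM_RIGHTS ||
            cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
            return -1;

        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
        return fd;
    }
    ```

    The received descriptor has a different number in the receiving process but refers to the same open socket, so the updated server can call accept() on it right away.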

    As long as the backlog (see listen()) is sufficient (or syncookies are enabled in Linux, which makes the backlog essentially unlimited), this should be a quite robust approach.
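
    To illustrate why the backlog matters, the sketch below opens a listening socket and shows that a client can complete connect() even though nobody has called accept() yet. The helper names are hypothetical and the loopback address is just an assumption for the example. Note that Linux silently caps the backlog at the net.core.somaxconn sysctl value.

    ```c
    #include <assert.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Fill addr with 127.0.0.1:port. */
    static void loopback_addr(struct sockaddr_in *addr, unsigned short port)
    {
        memset(addr, 0, sizeof *addr);
        addr->sin_family = AF_INET;
        addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr->sin_port = htons(port);
    }

    /* Listen on 127.0.0.1:port (0 picks a free port) with the given
     * backlog. Returns the listening descriptor, or -1 on error. */
    static int make_listener(unsigned short port, int backlog)
    {
        struct sockaddr_in addr;
        int one = 1;
        int fd;

        assert(backlog > 0);
        fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd == -1)
            return -1;

        /* Lets a restarted server rebind the port without delay. */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        loopback_addr(&addr, port);
        if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
            listen(fd, backlog) == -1) {
            close(fd);
            return -1;
        }
        return fd;
    }

    /* Port the listener is bound to, or -1 on error. */
    static int listener_port(int fd)
    {
        struct sockaddr_in addr;
        socklen_t len = sizeof addr;

        if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
            return -1;
        return ntohs(addr.sin_port);
    }

    /* Connect to 127.0.0.1:port. Returns the descriptor, or -1. */
    static int connect_local(unsigned short port)
    {
        struct sockaddr_in addr;
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd == -1)
            return -1;
        loopback_addr(&addr, port);
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == -1) {
            close(fd);
            return -1;
        }
        return fd;
    }
    ```

    A server would typically call make_listener(port, SOMAXCONN). While the holder process keeps the descriptor open, clients calling connect_local() still complete the handshake and simply wait in the queue until the updated server calls accept().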

    If desired, I can provide example code on how this would work in Linux. (I consider the security aspects critical here, so I would definitely do Linux-only stuff, like examining /proc/PID/exe, to verify that only the updated binary can re-acquire the listening sockets.)
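
    As a small, Linux-only sketch of the /proc/PID/exe check mentioned above: the function name pid_runs_executable is hypothetical, and the PID is assumed to come from the kernel-verified credentials, never from data the peer sends itself. Two paths refer to the same file exactly when their device and inode numbers match.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Returns 1 if process pid is running the file at expected_path,
     * 0 if it is running something else, -1 if either stat() fails. */
    static int pid_runs_executable(pid_t pid, const char *expected_path)
    {
        char link[64];
        struct stat actual, expected;
        int n;

        assert(expected_path != NULL);

        n = snprintf(link, sizeof link, "/proc/%ld/exe", (long)pid);
        if (n < 0 || (size_t)n >= sizeof link)
            return -1;

        /* stat() follows the /proc/PID/exe symlink to the real file. */
        if (stat(link, &actual) == -1)
            return -1;
        if (stat(expected_path, &expected) == -1)
            return -1;

        return actual.st_dev == expected.st_dev &&
               actual.st_ino == expected.st_ino;
    }
    ```

    The holder would hand over the listening sockets only when this returns 1 for the path of the updated binary, and the user and group in the credentials also match what it expects.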