Assume you have a 24/7 server running on a Linux machine that handles incoming connections, both "plain" TCP and TLS (via OpenSSL). For the service to work, clients must maintain a connection to it at all times. Unfortunately, some of those clients don't reconnect immediately when the server closes the connection, so the server tries its best to keep each connection alive forever.
However, if the server needs to be rebooted, e.g. for maintenance, the connections are lost.
To avoid disconnecting, I want to "move" established TCP sessions to another machine using the TCP_REPAIR mechanism (https://lwn.net/Articles/495304/). Basically, this means saving the TCP socket state on machine A (such as the sequence and acknowledgment numbers), restoring it on machine B, and ensuring that new IP packets are routed to the new machine.
This works fairly well with plain TCP: the clients don't notice that the TCP connection has been redirected to another machine. With TLS, however, this obviously requires more work.
To simplify, let's assume that there are no TLS messages on the wire, no SSL_read or SSL_write is pending, and all previous TLS messages have been sent and received completely.
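For readers unfamiliar with TCP_REPAIR, the "save" half on machine A might look roughly like the sketch below. It is a minimal illustration, not the code I actually use: the struct `tcp_saved_seq`, the helpers `read_queue_seq` and `tcp_dump_seq`, and the `MIG_*` macros are hypothetical names. It assumes Linux and a process with CAP_NET_ADMIN. A real migration also has to carry over the addresses, the queued data, and the negotiated TCP options, which are left out here.

```c
#include <errno.h>        /* callers inspect errno (e.g. EPERM) on failure */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdint.h>
#include <sys/socket.h>

/* Fallbacks for older libc headers; values match the Linux UAPI. */
#ifndef TCP_REPAIR
#define TCP_REPAIR 19
#endif
#ifndef TCP_REPAIR_QUEUE
#define TCP_REPAIR_QUEUE 20
#endif
#ifndef TCP_QUEUE_SEQ
#define TCP_QUEUE_SEQ 21
#endif

/* Queue selectors for TCP_REPAIR_QUEUE (hypothetical names, kernel values). */
#define MIG_RECV_QUEUE 1
#define MIG_SEND_QUEUE 2

/* Hypothetical container for the two sequence numbers to carry over. */
struct tcp_saved_seq {
    uint32_t snd_seq;   /* next sequence number we would send */
    uint32_t rcv_seq;   /* next sequence number we expect to receive */
};

/* Select one queue, then read its current sequence number. */
static int read_queue_seq(int fd, int queue, uint32_t *seq)
{
    socklen_t len = sizeof(*seq);

    if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_QUEUE, &queue, sizeof(queue)) < 0)
        return -1;
    return getsockopt(fd, IPPROTO_TCP, TCP_QUEUE_SEQ, seq, &len);
}

/*
 * Put the socket into repair mode and dump both sequence numbers.
 * On success the socket is left in repair mode, so that the caller can
 * close() it without a FIN or RST going out to the client.
 * Returns 0 on success, -1 on failure with errno set.
 */
int tcp_dump_seq(int fd, struct tcp_saved_seq *out)
{
    int on = 1;

    if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on)) < 0)
        return -1;      /* EPERM without CAP_NET_ADMIN */
    if (read_queue_seq(fd, MIG_SEND_QUEUE, &out->snd_seq) < 0 ||
        read_queue_seq(fd, MIG_RECV_QUEUE, &out->rcv_seq) < 0)
        return -1;
    return 0;
}
```

On machine B, the mirror image would be a fresh socket put into repair mode, with the same two values written back via setsockopt(TCP_QUEUE_SEQ) before bind() and connect(). In repair mode, connect() sets up the connection state without sending a SYN.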
What I tried so far:
Approach 1: Silently re-create a new SSL object using the same SSL_SESSION
Try to create a new SSL object, make it "established" and attach it to the fd:
This didn't work at all. I always see a full handshake, and SSL_session_reused returns 0 for both SSL objects. And even if the SSL_SESSION were reused, I doubt that this would be sufficient.
Approach 2: The memcpy approach
This is basically an attempt to create something like "i2d_SSL" and "d2i_SSL" methods.
I suppose Approach 2 could work, but it's pretty hard to identify the fields that are really relevant, if it can work at all.
Can anyone shed some light on this?
I have been asked this question from time to time over the years.
The answer is that OpenSSL does not support this. An SSL object contains a lot of temporary, connection-specific state, so approach 1 is doomed to failure. For approach 2, there are many dependent sub-objects hidden behind the top-level SSL object, e.g. for tracking cipher and hash states. You would need to replicate all of these. All of those objects hold references into the specific libssl/libcrypto instance (such as the loaded provider objects), so this is unfortunately also doomed to failure.
Adding this to OpenSSL would be a major feature requiring a lot of effort. It is simply not possible without major changes to the underlying libraries.
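To make "connection-specific state" a little more concrete, here is one small piece of it, sketched outside of OpenSSL. In TLS 1.3 (RFC 8446, section 5.3) the AEAD nonce for each record is the per-direction static IV XORed with a 64-bit record sequence number that starts at zero and increments with every record. The helper name `tls13_record_nonce` is made up for illustration and is not an OpenSSL API. OpenSSL keeps the equivalent counters and keys inside its internal record-layer objects.

```c
#include <stdint.h>
#include <string.h>

#define TLS13_IV_LEN 12   /* IV length of the AEADs used by TLS 1.3 */

/*
 * Compute the per-record nonce as described in RFC 8446, section 5.3:
 * the sequence number is encoded big-endian, left-padded with zeros to
 * the IV length, and XORed with the static IV.
 */
void tls13_record_nonce(const uint8_t iv[TLS13_IV_LEN], uint64_t seq,
                        uint8_t nonce[TLS13_IV_LEN])
{
    memcpy(nonce, iv, TLS13_IV_LEN);
    for (int i = 0; i < 8; i++) {
        /* Byte i of seq (least significant first) lands at the tail. */
        nonce[TLS13_IV_LEN - 1 - i] ^= (uint8_t)(seq >> (8 * i));
    }
}
```

A migrated connection would need the exact read and write sequence numbers, IVs, and keys for both directions, otherwise the very next record fails authentication at the peer. An SSL_SESSION carries none of that, only the material needed to resume with a new handshake.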