Tags: ssl, networking, openssl, udp, dtls

Slow DTLS handshake in case of packet loss


I implemented a DTLS server using OpenSSL. (I have a UDP socket and I am using a memory BIO to communicate with OpenSSL.) However, if there is packet loss the DTLS handshake can take 1-2 seconds, which is a lot in my case.

The normal flow when there is no packet loss (a few milliseconds):

Client Hello ------------------------->
             <------------------------- Server Hello
Rest of the handshake ---------------->
                      <---------------- Rest of the handshake

The flow I am experiencing (a few seconds):

Client Hello ------------------------->
             <-------(lost)------------ Server Hello
Client Hello ------------------------->
Client Hello ------------------------->
Client Hello ------------------------->
             <------------------------- Server Hello
Rest of the handshake ---------------->
                      <---------------- Rest of the handshake

I can easily reproduce it, even in a local environment, by deliberately dropping the first Server Hello.

I am curious why the server does not respond to the next several Client Hellos. If it answered them, the handshake could complete in well under 1 second. Instead it takes 1-2 seconds, until the server finally answers one of the later Client Hellos.

How can I make the DTLS handshake complete faster? (For example, is there a way to make the server answer every Client Hello?)


Solution

  • In OpenSSL the default DTLS retransmit timer starts at 1 second and doubles each time it expires without a response. In your case the server has sent its ServerHello (and presumably ServerHelloDone and maybe other messages, though you don't show them) and is now waiting for the ClientKeyExchange message. Any subsequent ClientHellos received from the same peer are assumed to be stale retransmits and are ignored.

    You can get the server to operate statelessly using DTLSv1_listen(), which means that the server will listen for connections from any peer, assume any incoming ClientHello is from a new peer, and respond to it immediately. However, this adds an extra round trip, because every ClientHello is then required to carry a "cookie". It also wouldn't solve packet loss later in the handshake, so it doesn't really help with your problem.

    From OpenSSL 1.1.1 you can use a custom DTLS timer callback to control the timeouts on the timer via DTLS_set_timer_cb():

    https://www.openssl.org/docs/man1.1.1/man3/DTLS_set_timer_cb.html

    Here's an example which sets the initial timer to be 50ms and periodically doubles it (50ms is probably too short for a network based implementation - but this particular example is an "in memory" test):

    https://github.com/openssl/openssl/blob/20c98cd45399423f760dbd75d8912769c6b7b10e/test/dtlstest.c#L45-L53

    Be careful not to set the timeout too low because you might start flooding your network with large numbers of spurious retransmits.
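
As a rough illustration of the timer callback approach described above, here is a minimal sketch of a callback that starts at a shorter timeout, doubles it on each expiry, and caps it. The function name handshake_timer_cb and both constants are hypothetical values chosen for illustration, not OpenSSL defaults. The sketch assumes, as in the linked test, that the callback receives 0 when the timer is first started and the previous duration in microseconds afterwards. The forward declaration of SSL only keeps the snippet standalone; a real program would include <openssl/ssl.h> instead.

```c
#include <stddef.h>

/* Stand-in so the sketch compiles on its own. In a real program, remove
 * this line and #include <openssl/ssl.h>, which declares SSL and
 * DTLS_set_timer_cb(). */
typedef struct ssl_st SSL;

/* Hypothetical tuning values in microseconds (not OpenSSL defaults). */
#define INITIAL_TIMEOUT_US 100000u   /* first retransmit after 100 ms */
#define MAX_TIMEOUT_US     2000000u  /* never wait longer than 2 s */

/* Same shape as OpenSSL's DTLS_timer_cb: takes the current timer
 * duration in microseconds and returns the next one. */
static unsigned int handshake_timer_cb(SSL *ssl, unsigned int timer_us)
{
    (void)ssl; /* unused in this sketch */

    if (timer_us == 0)
        return INITIAL_TIMEOUT_US;      /* timer is being started */

    if (timer_us >= MAX_TIMEOUT_US / 2)
        return MAX_TIMEOUT_US;          /* cap the exponential backoff */

    return 2 * timer_us;                /* double after each expiry */
}
```

The callback would be registered on each connection before the handshake starts, with DTLS_set_timer_cb(ssl, handshake_timer_cb). Since you are using a memory BIO, the timer presumably still has to be driven by your own event loop, for example by polling DTLSv1_get_timeout() and calling DTLSv1_handle_timeout() when it expires; the callback only changes how long each interval is.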