python python-3.x python-requests urllib

python requests keep connection alive for indefinite time


I'm trying to get a Python script running that calls an external API (to which I only have read access) at a fixed interval. The API uses cookie-based authentication: calling the /auth endpoint initially sets session cookies, which are then used for authentication in further requests.

As for my problem: because the authentication is based on an active session, the cookies are no longer valid once the connection drops, so the session has to be restarted. From what I've read, requests is based on urllib3, which keeps the connection alive by default. Yet after a few tests I noticed that under some circumstances the connection is dropped anyway.

I used a Session object from the requests module and I've tested how long it takes for the connection to be dropped as follows:

from requests import Session
import logging
from time import time, sleep

logging.basicConfig(level=logging.DEBUG)

def tt(interval):
    credentials = {"username":"user","password":"pass"}
    s = Session()
    r = s.post("https://<host>:<port>/auth", json=credentials)
    ts = time()
    while r.status_code == 200:
        r = s.get("https://<host>:<port>/some/other/endpoint")
        sleep(interval)
    return time() - ts # Seconds until connection drop

This might not be the best way to find that out, but I ran the function twice, once with an interval of 1 second and once with an interval of 1 minute. Both runs lasted about an hour before I had to stop them manually.

However, when I swapped the two lines within the while loop, so that there was a 1-minute delay after the initial POST /auth request, the following GET request failed with a 401 Unauthorized, and this message was logged beforehand:

DEBUG:urllib3.connectionpool:Resetting dropped connection: <host>

As the interval of requests may range from a few minutes to multiple hours in my prod script, I have to know beforehand how long these sessions are kept alive and whether there are some exceptions to that rule (like dropping the connection if no request after the initial POST /auth is made for a short while).

So, how long does requests or rather urllib3 keep the connection alive, and is it possible to extend that time indefinitely?

Or is it the server instead of requests that drops the connection?


Solution

  • By using requests.Session, keep-alive is handled for you automatically.

    In the first version of your loop, which polls the server immediately after the /auth call, the server does not drop the connection because each GET arrives before the connection has been idle for too long. In the second version, it's likely that the sleep interval exceeds the amount of time the server is configured to keep an idle connection open.

    Depending on the server configuration of the API, the response headers may include a Keep-Alive header with information about how long idle connections are kept open at a minimum. In the keep-alive convention used with HTTP/1.0, this information is carried in the timeout parameter of the Keep-Alive header (for example, Keep-Alive: timeout=5, max=100). You could use this value to estimate how long you have until the server drops the connection.

    In HTTP/1.1, persistent connections are used by default and the Keep-Alive header is not used unless the server explicitly implements it for backwards compatibility. Because of this difference, there isn't a reliable way for a client to determine the exact connection timeout, since it may exist solely as server-side configuration.
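If the server does send a Keep-Alive header, its parameters can be read with a few lines of Python. The following is a minimal sketch, not part of requests itself: the helper name parse_keep_alive is hypothetical, and it assumes the header value has the usual comma-separated name=value form with integer values.

```python
def parse_keep_alive(header_value):
    """Parse a Keep-Alive header value such as 'timeout=5, max=100'.

    Returns a dict mapping lower-cased parameter names to integers.
    Parameters without an integer value are skipped.
    """
    params = {}
    if not header_value:
        # Header absent: the server did not advertise any timeout.
        return params
    for part in header_value.split(","):
        name, sep, value = part.strip().partition("=")
        if sep and value.strip().isdigit():
            params[name.strip().lower()] = int(value.strip())
    return params
```

With requests, the raw value would come from r.headers.get("Keep-Alive"), which returns None when the header is missing. A missing header only means the server did not advertise a timeout; the server may still close idle connections.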

    The key to keeping the connection open would be to continue polling at regular intervals. The interval you use must be less than the server's configured connection timeout.

    One other thing to point out is that artificially extending the length of the session indefinitely this way makes one more vulnerable to session fixation attacks. You may want to consider adding logic that occasionally reestablishes the session to minimize risk of these types of attacks.
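The two suggestions above (poll more often than the server's timeout, and occasionally reestablish the session) could be combined roughly as follows. This is only a sketch under assumptions: the function name poll_with_reauth, its parameters, and the choice to log in again on a 401 are illustrative and not taken from the question. It accepts a session factory so it does not depend on requests directly; the session object only needs post, get and close methods, which requests.Session provides.

```python
import time


def poll_with_reauth(make_session, auth_url, poll_url, credentials,
                     interval, max_session_age, handle,
                     max_polls=None, sleep=time.sleep, clock=time.monotonic):
    """Poll poll_url every `interval` seconds, logging in again when needed.

    A new session is created when the current one is older than
    max_session_age seconds, or when the server answers 401.
    """
    def login():
        session = make_session()
        response = session.post(auth_url, json=credentials)
        response.raise_for_status()  # fail loudly if the login itself fails
        return session, clock()

    session, started = login()
    polls = 0
    while max_polls is None or polls < max_polls:
        # Deliberately rotate the session once it gets too old.
        if clock() - started >= max_session_age:
            session.close()
            session, started = login()

        response = session.get(poll_url)
        if response.status_code == 401:
            # The server no longer accepts the cookies: log in and retry once.
            session.close()
            session, started = login()
            response = session.get(poll_url)

        handle(response)
        polls += 1
        sleep(interval)
```

With requests, this might be called as poll_with_reauth(requests.Session, auth_url, poll_url, credentials, interval=30, max_session_age=3600, handle=print), where the URLs and the numbers are placeholders. The interval would need to stay below the server's idle timeout, which may have to be found by experiment.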