I'm trying to get a Python script running that calls an external API (to which I only have read access) at a certain interval. The API uses cookie-based authentication: calling the /auth endpoint initially sets session cookies, which are then used for authentication in further requests.
As for my problem: because the authentication is based on an active session, the cookies aren't valid once the connection drops, and the session therefore has to be restarted. From what I've read, requests is based on urllib3, which keeps the connection alive by default. Yet after a few tests I noticed that under some circumstances the connection is dropped anyway.
I used a Session object from the requests module and tested how long it takes for the connection to be dropped as follows:
from requests import Session
import logging
from time import time, sleep

# Log at DEBUG level so urllib3's connection pool messages are visible
logging.basicConfig(level=logging.DEBUG)

def tt(interval):
    credentials = {"username": "user", "password": "pass"}
    s = Session()
    # The initial POST sets the session cookies
    r = s.post("https://<host>:<port>/auth", json=credentials)
    ts = time()
    # Keep polling until a request stops returning 200
    while r.status_code == 200:
        r = s.get("https://<host>:<port>/some/other/endpoint")
        sleep(interval)
    return time() - ts  # Seconds until connection drop
This might not be the best way to find that out, but I let that function run twice, once with an interval of 1 second and then with an interval of 1 minute. Both runs went on for about an hour until I had to stop the execution manually.
However, when I swapped the two lines within the while loop, which meant that there was a 1-minute delay after the initial POST /auth request, the following GET request failed with a 401 Unauthorized, and this message was logged beforehand:
DEBUG:urllib3.connectionpool:Resetting dropped connection: <host>
As the interval between requests may range from a few minutes to multiple hours in my prod script, I have to know beforehand how long these sessions are kept alive and whether there are exceptions to that rule (like dropping the connection if no request is made for a short while after the initial POST /auth).
So, how long does requests, or rather urllib3, keep the connection alive, and is it possible to extend that time indefinitely?
Or is it the server instead of requests that drops the connection?
By using requests.Session, keep-alive is handled for you automatically.
In the first version of your loop, which polls the server continuously after the /auth call is made, the server does not drop the connection because each subsequent GET keeps it active. In the second version, it's likely that the sleep interval exceeds the amount of time the server is configured to keep an idle connection open.
Depending on the server configuration of the API, the response headers may include a Keep-Alive header with information about how long connections are kept open at a minimum. With HTTP/1.0 keep-alive, this information is carried in the timeout parameter of the Keep-Alive header. You could use this information to estimate how long you have until the server drops the connection.
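To check whether the server advertises such a timeout, you could inspect the response headers. The following is only a minimal sketch: the helper names parse_keep_alive and keep_alive_timeout are made up for illustration, and it assumes the header value looks like "timeout=5, max=100", which the server may or may not send.

```python
def parse_keep_alive(header_value):
    """Parse a Keep-Alive header value such as 'timeout=5, max=100' into a dict."""
    params = {}
    for part in header_value.split(","):
        name, sep, value = part.strip().partition("=")
        if not sep:
            continue  # skip parameters that have no '=value' part
        value = value.strip()
        # Convert purely numeric values to int, keep anything else as text
        params[name.strip().lower()] = int(value) if value.isdigit() else value
    return params


def keep_alive_timeout(response):
    """Return the advertised idle timeout in seconds, or None if not advertised."""
    header = response.headers.get("Keep-Alive")
    if header is None:
        return None
    return parse_keep_alive(header).get("timeout")
```

With requests, you would pass in the response of any call, e.g. keep_alive_timeout(s.get(url)). A result of None only means the server did not advertise a timeout, not that there is none.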
In HTTP/1.1, persistent connections are used by default and the Keep-Alive header is not used unless the server explicitly implements it for backwards compatibility. Because of this difference, there isn't an immediate way for a client to determine the exact timeout for connections, since it may exist solely as server-side configuration.
The key to keeping the connection open would be to continue polling at regular intervals. The interval you use must be less than the server's configured connection timeout.
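If your real work only happens every few hours, one way to keep polling in between is to split the long wait into shorter chunks and send a cheap request after each chunk. This is a rough sketch under assumptions: wait_with_heartbeat is a hypothetical helper, the URL you poll is whatever inexpensive endpoint the API offers, and heartbeat_interval has to be chosen below the server's (possibly unknown) idle timeout.

```python
import time


def wait_with_heartbeat(session, url, total_wait, heartbeat_interval, sleep=time.sleep):
    """Wait total_wait seconds, sending a GET every heartbeat_interval seconds.

    Returns the number of heartbeat requests that were sent.
    """
    sent = 0
    remaining = total_wait
    while remaining > heartbeat_interval:
        sleep(heartbeat_interval)
        remaining -= heartbeat_interval
        # A failing heartbeat (e.g. 401) raises, so the caller can re-authenticate
        session.get(url).raise_for_status()
        sent += 1
    sleep(remaining)  # sleep off whatever is left of the wait
    return sent
```

You would call it with your existing Session object in place of a plain sleep, for example wait_with_heartbeat(s, "https://<host>:<port>/some/other/endpoint", 3600, 30).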
One other thing to point out is that artificially extending the length of the session indefinitely this way makes one more vulnerable to session fixation attacks. You may want to consider adding logic that occasionally reestablishes the session to minimize risk of these types of attacks.
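Such re-establishing logic could look roughly like the sketch below, which logs in again once the session reaches a maximum age and also retries once after a 401. The class name ReauthenticatingClient and its parameters are invented for this example; session_factory is assumed to be something like requests.Session, and the /auth call is assumed to accept the credentials as JSON, as in your test code.

```python
import time


class ReauthenticatingClient:
    """Wrap a session so it is rebuilt after max_age seconds or on a 401."""

    def __init__(self, session_factory, auth_url, credentials, max_age,
                 clock=time.monotonic):
        self._session_factory = session_factory  # e.g. requests.Session
        self._auth_url = auth_url
        self._credentials = credentials
        self._max_age = max_age
        self._clock = clock
        self._session = None
        self._logged_in_at = None

    def _login(self):
        # Start from a fresh session so that old cookies are discarded
        self._session = self._session_factory()
        response = self._session.post(self._auth_url, json=self._credentials)
        response.raise_for_status()
        self._logged_in_at = self._clock()

    def get(self, url):
        expired = (self._session is None
                   or self._clock() - self._logged_in_at >= self._max_age)
        if expired:
            self._login()
        response = self._session.get(url)
        if response.status_code == 401:
            # The server no longer accepts the session: log in and retry once
            self._login()
            response = self._session.get(url)
        return response
```

This also covers the dropped-connection case from the question, since a 401 after a long pause simply triggers a fresh login instead of ending the loop.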