python-3.x · python-requests · httpconnection

Requests HTTPConnectionPool Read timeout never recovers


I have a script that runs 24/7 and is sometimes killed by a system reboot. One part of the script collects bins from pastebin[.]com with certain contents, and the other exports them to a remote REST endpoint. The part that collects bins sends a lot of requests and never runs into issues with HTTPConnectionPool, while the exporting part tends to hit them pretty quickly, despite sending requests much less often.

I have the following code with retry logic, to ensure the bin gets exported to the remote endpoint:

def send_export_request(self, payload):
    while True:
        success = False
        try:
            self.session.post(self.collector, data=payload, timeout=10)
            success = True
        except requests.exceptions.RequestException as e:
            self.logger.log_error("RequestException occurred when storing paste %s: %s" % (payload['key'], e))

        if success:
            break

        self.logger.log("Retrying to store the paste...")
        self.session.close()
        self.session = requests.session()
        sleep(2)

Of course, self.session is initialized in the constructor to requests.session(). What eventually always happens (the exact time varies from case to case, but it is always under 24 hours) is that the following exception is raised:

HTTPConnectionPool(host='www.[redacted].com', port=80): Read timed out. (read timeout=10)

And the code goes into the loop: raising this exception, logging it, waiting 2 seconds, trying again, raising the exception, and so on. It never recovers unless I kill the script and run it again. I searched a lot and originally tried the code without a session (just plain post requests), then added the session, and finally tried creating a new session before retrying. None of that works. What am I missing?


Solution

  • No wonder no one knew where the issue lay. I will answer my own question to shed some light on what the problem was.

    I did some further testing: the remote server to which I was posting the contents of the bins had some sort of IPS (intrusion prevention system) or similar enabled. The collector is intentionally not behind HTTPS, so payload inspection was possible, and when the payload contained certain keywords or known signatures, the remote server silently let the connection time out.

    As keeping the requests off HTTPS is crucial for my use case (traffic sniffing and inspection must be possible for anyone), I settled on a workaround: if a request is killed by the remote server, I base64-encode its body before retrying, and then it works.
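
    A rough sketch of that workaround, assuming the shape of `send_export_request` from the question (the `use_b64` flag, the `data_b64` field name, and the standalone-function signature are all illustrative assumptions, not the author's exact implementation):

    ```python
    import base64
    import json
    from time import sleep

    import requests


    def send_export_request(session, collector, payload, logger):
        """Post payload to the collector; if the plain request keeps being
        dropped (e.g. by an IPS matching known signatures), retry with a
        base64-encoded body so payload inspection no longer matches."""
        use_b64 = False  # flipped after the first failure
        while True:
            body = payload
            if use_b64:
                # Encode the whole payload so keyword/signature-based
                # inspection cannot see the original contents.
                body = {
                    "key": payload["key"],
                    "data_b64": base64.b64encode(
                        json.dumps(payload).encode()).decode(),
                }
            try:
                session.post(collector, data=body, timeout=10)
                return
            except requests.exceptions.RequestException as e:
                logger.log_error(
                    "RequestException occurred when storing paste %s: %s"
                    % (payload["key"], e))
                use_b64 = True  # next attempt goes out encoded
                sleep(2)
    ```

    The receiving end would of course need to detect the `data_b64` field and decode it before processing.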