How to handle urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>?


I have the following small Python script:

from urllib.request import urlopen

def download_file(url):
    fp = open(url.split("=")[-1] + ".pdf", 'wb')
    req = urlopen(url)
    CHUNK = 20480
    chunk = req.read(CHUNK)
    fp.write(chunk)
    fp.close()

for i in range(1, 10000, 1):
    download_file("__some__url__" + str(i))

print("Done.")

I leave this script running, but after some time (say, after downloading 100 files) it fails with the error: urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>

How can I modify my code to handle this error, i.e. so that the script does not stop but instead waits for the connection to be restored and then resumes downloading where it left off?

PS: I know it only downloads the first 20 KB from each URL.


Solution

  • For possible causes of this error, have a look at python: [Errno 10054] An existing connection was forcibly closed by the remote host. My conclusion after reading the highest-voted answers is simply that it can happen and your code should be prepared for it.

    I would use a try: ... except ... block in a loop here and add an increasing delay before retrying a failed connection:

    import time
    import urllib.error
    from urllib.request import urlopen

    def download_file(url):
        # prefer `with` when exceptions are expected, so the file is always closed
        with open(url.split("=")[-1] + ".pdf", 'wb') as fp:
            delay = 5            # seconds to wait before the first retry
            max_retries = 3
            for _ in range(max_retries):
                try:
                    req = urlopen(url)
                    CHUNK = 20480
                    chunk = req.read(CHUNK)
                    fp.write(chunk)
                    break                # do not loop after a successful download...
                except urllib.error.URLError:
                    time.sleep(delay)    # wait, then double the delay for the next attempt
                    delay *= 2
            else:                # signal an abort if download was not possible
                print(f"Failed for {url} after {max_retries} attempts")
                # or if you want to abort the script
                # raise Exception(f"Failed for {url} after {max_retries} attempts")
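
To cover the "resume where it left off" part of the question, the retry logic could be pulled into its own function so that the outer loop simply moves on and records which URLs failed. The following is only a sketch under a few assumptions: the names fetch_with_retries and download_all, and the opener and sleep parameters, are made up for illustration (they only make the retry logic testable without a network). It also catches ConnectionError, because a reset that happens during read() rather than during urlopen() is raised as ConnectionResetError and is not wrapped in URLError.

```python
import time
import urllib.error
from urllib.request import urlopen


def fetch_with_retries(url, max_retries=3, delay=5, opener=urlopen, sleep=time.sleep):
    """Return the first 20 KB of url, or None if every attempt fails.

    opener and sleep are hypothetical hooks added for this sketch; by
    default they are the real urlopen and time.sleep.
    """
    for attempt in range(max_retries):
        try:
            # the response is closed automatically when the block exits
            with opener(url) as response:
                return response.read(20480)
        except (urllib.error.URLError, ConnectionError):
            # no point in sleeping after the last attempt
            if attempt < max_retries - 1:
                sleep(delay)
                delay *= 2  # back off: 5 s, 10 s, 20 s, ...
    return None


def download_all(urls, fetch=fetch_with_retries):
    """Try every URL in turn; return the list of URLs that could not be fetched."""
    failed = []
    for url in urls:
        data = fetch(url)
        if data is None:
            failed.append(url)  # remember it and carry on with the next URL
            continue
        # the file is only created once the data has actually arrived
        with open(url.split("=")[-1] + ".pdf", "wb") as fp:
            fp.write(data)
    return failed
```

With this shape, the loop from the question becomes something like failed = download_all("__some__url__" + str(i) for i in range(1, 10000)), and the returned list can be passed to download_all again later to retry only the URLs that did not succeed. Because the output file is opened only after a successful fetch, a failed URL does not leave an empty .pdf behind.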