python http wget

wget.download() function shows HTTP Error 404


When I run this:

wget.download("http://downloads.dell.com/FOLDER06808437M/1/7760%20AIO-WIN10-A11-5VNTG.CAB")

it shows this error:

  File "C:\Program Files\Python39\lib\urllib\request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

But when I run wget http://downloads.dell.com/FOLDER06808437M/1/7760%20AIO-WIN10-A11-5VNTG.CAB manually, it works perfectly fine.


Solution

  • I looked at the source code of wget.download and there appears to be a bug. The relevant piece of the source is:

    if PY3K:
        # Python 3 can not quote URL as needed
        binurl = list(urlparse.urlsplit(url))
        binurl[2] = urlparse.quote(binurl[2])
        binurl = urlparse.urlunsplit(binurl)
    else:
        binurl = url
    

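    This assumes the URL still needs to be quoted, that is, that illegal characters such as spaces must be replaced by codes after a % sign. That was already done in your case, since your URL contains %20 rather than a space. As a result the URL is quoted a second time and altered when it should not be: the % itself becomes %25, so %20 turns into %2520. You can reproduce this with: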

    import urllib.parse as urlparse
    url = "http://downloads.dell.com/FOLDER06808437M/1/7760%20AIO-WIN10-A11-5VNTG.CAB"
    binurl = list(urlparse.urlsplit(url))
    binurl[2] = urlparse.quote(binurl[2])
    binurl = urlparse.urlunsplit(binurl)
    print(binurl) # http://downloads.dell.com/FOLDER06808437M/1/7760%2520AIO-WIN10-A11-5VNTG.CAB
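
If you need to handle both already-quoted and unquoted URLs, one possible approach is to decode the path before quoting it, so that the quoting is applied exactly once. The sketch below is only an illustration using the standard library; `requote_path` is a hypothetical helper name, not part of the wget package, and it assumes the path contains no encoded slashes (%2F), which would be turned into real path separators.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def requote_path(url: str) -> str:
    """Return url with its path percent-encoded exactly once (hypothetical helper)."""
    parts = list(urlsplit(url))
    # Decode first, so an already-encoded path is not encoded a second time.
    parts[2] = quote(unquote(parts[2]))
    return urlunsplit(parts)


encoded = "http://downloads.dell.com/FOLDER06808437M/1/7760%20AIO-WIN10-A11-5VNTG.CAB"
raw = "http://downloads.dell.com/FOLDER06808437M/1/7760 AIO-WIN10-A11-5VNTG.CAB"

# Both forms end up with a single %20 in the path.
print(requote_path(encoded))
print(requote_path(raw))
```

Both calls print the URL with %20 in it, whichever form was passed in.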
    

    You can either work around this by passing the URL in a form that still needs escaping, in this case

    import wget
    wget.download("http://downloads.dell.com/FOLDER06808437M/1/7760 AIO-WIN10-A11-5VNTG.CAB")
    

    or use urllib.request.urlretrieve, whose most basic form is

    import urllib.request
    urllib.request.urlretrieve("http://downloads.dell.com/FOLDER06808437M/1/7760%20AIO-WIN10-A11-5VNTG.CAB", "7760 AIO-WIN10-A11-5VNTG.CAB")
    

    where the arguments are the URL and the filename. Keep in mind that, used this way, there is no progress indicator (bar), so you have to wait until the download completes.
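
If you want some feedback while waiting, urlretrieve accepts an optional reporthook callback, which it calls with the block number, the block size and the total size. Below is a minimal sketch of a text progress display built on that. The names `progress_line`, `report` and `download_with_progress` are made up for this example, and the output format is just one possibility.

```python
import sys
import urllib.request


def progress_line(block_num: int, block_size: int, total_size: int) -> str:
    """Format one progress message from the values urlretrieve passes to its hook."""
    done = block_num * block_size
    if total_size <= 0:
        # The server sent no Content-Length, so only the byte count is known.
        return f"{done} bytes"
    # The last block is usually partial, so cap the figure at the total.
    done = min(done, total_size)
    return f"{done * 100 // total_size}% ({done} of {total_size} bytes)"


def report(block_num: int, block_size: int, total_size: int) -> None:
    # "\r" moves back to the start of the console line so it is overwritten.
    sys.stderr.write("\r" + progress_line(block_num, block_size, total_size))
    sys.stderr.flush()


def download_with_progress(url: str, filename: str) -> None:
    """Download url to filename, printing progress to stderr."""
    urllib.request.urlretrieve(url, filename, reporthook=report)
    sys.stderr.write("\n")
```

It would be called as download_with_progress("http://downloads.dell.com/FOLDER06808437M/1/7760%20AIO-WIN10-A11-5VNTG.CAB", "7760 AIO-WIN10-A11-5VNTG.CAB"). Nothing is downloaded until that call is made.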