python-3.x web-scraping python-requests http-status-code-404 http-status-code-200

Why does a valid PDF link return status code 404?


I am trying to check the status_code of a valid URL, but a 404 is returned even though the link exists. Here is the line of code that returns a 404:

print(requests.get("https://www.moh.gov.sg/docs/librariesprovider5/local-situation-report/situation-report-21-jul-2020.pdf").status_code)

This is the link to the PDF that I am trying to access: https://www.moh.gov.sg/docs/librariesprovider5/local-situation-report/situation-report-21-jul-2020.pdf

Can anyone explain why I get a 404 when accessing a valid URL? Thank you.


Solution

  • The server returns 404 Not Found because the request was not sent with the Accept header this server expects. The Accept request header tells the server which media types are acceptable in the response.

    import requests

    # verify=False turns off TLS certificate verification; drop it if the certificate validates.
    print(requests.get("https://www.moh.gov.sg/docs/librariesprovider5/local-situation-report/situation-report-21-jul-2020.pdf",
                       headers={"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3"},
                       verify=False).status_code)
    

    Run the code above and you should get status code 200.

    I have added a screenshot of the Network tab in the browser's developer tools for more clarity.
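
To see what the fix changes, it can help to look at the headers before anything goes over the network. The sketch below is not part of the original answer: it builds the request that requests would send, once with the library defaults and once with the browser-style Accept value from the answer, and prints the Accept header of each. The helper name prepare_get is hypothetical. When no Accept header is passed, requests sends Accept: */*, which, going by the answer above, this server apparently does not accept.

```python
import requests

URL = ("https://www.moh.gov.sg/docs/librariesprovider5/"
       "local-situation-report/situation-report-21-jul-2020.pdf")

# Browser-style Accept value copied from the answer above.
BROWSER_ACCEPT = (
    "text/html,application/xhtml+xml,application/xml;q=0.9,"
    "image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3"
)


def prepare_get(url, extra_headers=None):
    """Build the GET request that requests would send, without sending it.

    Hypothetical helper for illustration only. The session merges its
    default headers with any headers passed in; passed-in headers win.
    """
    session = requests.Session()
    request = requests.Request("GET", url, headers=extra_headers)
    return session.prepare_request(request)


default_request = prepare_get(URL)
browser_like_request = prepare_get(URL, {"Accept": BROWSER_ACCEPT})

print(default_request.headers["Accept"])       # the library default
print(browser_like_request.headers["Accept"])  # the overridden value
```

To compare the live responses, each prepared request can be passed to requests.Session().send(...) and its status_code inspected. The result depends on how the server behaves at the time, so no output is shown here.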