Tags: https, python-requests, python-3.6, http-status-code-500

500 Internal Server Error on a Python GET request, but the same URL works in the browser


I am trying to open and download PDFs with Python requests, using URLs I get from an API. This works for many of the files, but for files stored at one specific site I get a 500 Internal Server Error response. The response body is a simple HTML page containing only the text: Not Authenticated.

When I paste the same URL into Chrome, I get the PDF. However, the console shows a "503 - Failed to load resource" error because some icon failed to load. Could this be relevant?

The URL also works when I run it in Postman with no headers at all.

I seem to have the same issue as described in this question: "python requests http response 500 (site can be reached in browser)". However, the fix suggested there, adding a User-Agent header to the request, does not help. Could some other header be required, and is there any way of checking what request my Chrome browser sends?

Update: I logged the request Chrome sends and copied its headers into my Python request. I still get the same error, both with and without the cookie.

Here is my code:

import requests

# `url` is one of the PDF links returned by the API (not shown here).
# The headers below are copied from the request Chrome sends for the same URL.
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'nb,en-GB;q=0.9,en-US;q=0.8,en;q=0.7',
    'Connection': 'keep-alive',
    'Cookie': 'JSESSIONID=a95b392a6d468e2188e73d2c296b; NSC_FS-NL-CET-XFC-IUUQ-8081=ffffffff3d9c37c545525d5f4f58455e445a4a4229a1; JSESSIONID=7b1dd39854eee82b2db41225150e',
    'Host': url.split('/')[2],
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
}
response = requests.get(url, headers=headers, verify=True)

I am using Python 3.6.3.
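
On the question of checking what is actually sent: the browser's request can be read in the Network tab of Chrome's developer tools. For the Python side, one option is to point the client at a throwaway local server that records the headers it receives. The sketch below uses only the standard library. `HeaderRecorder` and `capture_headers` are hypothetical names chosen for illustration, not part of any library.

```python
import http.server
import threading


class HeaderRecorder(http.server.BaseHTTPRequestHandler):
    """Hypothetical helper: remembers the headers of the last GET it received."""

    last_headers = None

    def do_GET(self):
        # Store the request headers exactly as the client sent them.
        HeaderRecorder.last_headers = dict(self.headers.items())
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # suppress the default per-request log line


def capture_headers(send):
    """Serve one request on a free local port, call send(url), return the headers seen."""
    HeaderRecorder.last_headers = None
    server = http.server.HTTPServer(("127.0.0.1", 0), HeaderRecorder)
    server.timeout = 5  # give up if no request arrives
    worker = threading.Thread(target=server.handle_request, daemon=True)
    worker.start()
    try:
        send("http://127.0.0.1:%d/" % server.server_port)
    finally:
        worker.join(timeout=5)
        server.server_close()
    return HeaderRecorder.last_headers
```

Passing `lambda u: requests.get(u, headers=headers)` as `send` should return the headers that requests puts on the wire, which can then be compared line by line with what Chrome shows. Passing `lambda u: urllib.request.urlopen(u).close()` does the same for urllib.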


Solution

  • I found that the error only occurs when the GET goes through requests, so I switched to: urllib.request.urlopen(url)

    More info about this approach: Download file from web in Python 3
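
A minimal sketch of the urllib approach, assuming the goal is to save the PDF to disk. The function name `download_file`, the `dest_path` argument, and the 30-second timeout are illustrative choices, not taken from the original answer.

```python
import shutil
import urllib.request


def download_file(url, dest_path, timeout=30):
    """Stream the body of `url` into `dest_path` and return the path written."""
    # For HTTP(S) URLs, urlopen raises urllib.error.HTTPError on 4xx/5xx
    # responses, so a 500 would surface here as an exception.
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(dest_path, "wb") as out_file:
        # Copy in chunks so large PDFs are not held in memory all at once.
        shutil.copyfileobj(response, out_file)
    return dest_path
```

Usage would look like `download_file(url, "document.pdf")`, where `url` is one of the links returned by the API. Note that urllib sends a different default set of headers than requests does, so the two libraries are not guaranteed to get the same answer from a picky server.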