I'm using urllib2 to request a particular S3 bucket at hxxp://s3.amazonaws.com/mybucket. Amazon sends back an HTTP code of 301 along with some XML data (the redirect being to hxxp://mybucket.s3.amazonaws.com/). Instead of following the redirect, Python raises urllib2.HTTPError: HTTP Error 301: Moved Permanently.
According to the official Python docs at HOWTO Fetch Internet Resources Using urllib2, "the default handlers handle redirects (codes in the 300 range)".
Is Python handling this incorrectly (presumably because of the unexpected XML in the response), or am I doing something wrong? I've watched in Wireshark, and the response to Python's request is exactly the same as the response I get using a web client. In debugging, I don't see the XML being captured anywhere in the response object.
Thanks for any guidance.
Edit: Sorry for not posting the code initially. It's nothing special, literally just this:
import urllib2, httplib
request = urllib2.Request(site)
response = urllib2.urlopen(request)
You are better off using the requests library. requests handles redirection by default: http://docs.python-requests.org/en/latest/user/quickstart/#redirection-and-history
import requests
response = requests.get(site)
print(response.content)
I can't pinpoint the problem with urllib2. I looked through the documentation at https://docs.python.org/2/library/urllib2.html, but it isn't very intuitive.
It seems that in Python 3 the module was refactored to make it less of a burden to use, but I am still convinced that requests is the way to go.
Note: The urllib2 module has been split across several modules in Python 3 named urllib.request and urllib.error. The 2to3 tool will automatically adapt imports when converting your sources to Python 3.
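If you do want to stay with the standard library, it may help to know that the HTTPError you catch is itself a file-like response: it carries the status code, the headers, and the body, so the XML from the question should be readable from the exception. Below is a minimal Python 3 sketch using urllib.request and urllib.error. The helper names describe_http_error and fetch are made up for illustration, and the hand-built 301 at the end (with no Location header and a placeholder body) is only an assumption about what such a response might look like, not a capture of what S3 actually sends.

```python
import io
import urllib.error
import urllib.request
from email.message import Message


def describe_http_error(err):
    """Summarise an HTTPError: status code, Location header (if any), and body."""
    return {
        "code": err.code,
        # None when the server did not send a Location header
        "location": err.headers.get("Location"),
        # HTTPError is file-like, so the response body can be read from it
        "body": err.read().decode("utf-8", errors="replace"),
    }


def fetch(url):
    """Return the body on success, or a summary dict if urlopen raises HTTPError."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        return describe_http_error(err)


# Offline demonstration: a hand-built 301 with no Location header (hypothetical data)
fake_error = urllib.error.HTTPError(
    "http://example.invalid/",
    301,
    "Moved Permanently",
    Message(),
    io.BytesIO(b"<Error/>"),
)
summary = describe_http_error(fake_error)
```

Calling fetch(site) on the real URL and looking at the "location" and "body" entries should show whether the server gave the redirect handler a target to follow at all.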