Python's urllib.request.urlopen() will raise an HTTPError exception if the HTTP status code of the response is not OK (e.g., 404).
This is because the default opener uses the HTTPDefaultErrorHandler class:
"A class which defines a default handler for HTTP error responses; all responses are turned into HTTPError exceptions."
Even if you build your own opener, it (un)helpfully includes the HTTPDefaultErrorHandler for you implicitly.
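To see that implicit handler at work, here is a minimal sketch that builds an opener by hand and requests a page that does not exist. The local test server, the AlwaysNotFound handler, and the helper names are assumptions made up for illustration so the example runs without touching the network; the empty ProxyHandler is only there so the sketch ignores any proxy settings.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysNotFound(BaseHTTPRequestHandler):
    """Hypothetical test handler: answers every GET with a 404."""

    def do_GET(self):
        body = b"nothing here"
        self.send_response(404)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def status_with_default_handlers(url):
    """Return (raised, status) for a request made with a hand-built opener."""
    # No error-handling classes are passed here, so build_opener() adds
    # its default ones implicitly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url) as response:
            return False, response.status
    except urllib.error.HTTPError as err:
        # The status code is still available on the exception object.
        return True, err.code


def demo():
    """Start a throwaway local server and request a missing page from it."""
    server = HTTPServer(("127.0.0.1", 0), AlwaysNotFound)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/missing"
        return status_with_default_handlers(url)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() should return (True, 404): the hand-built opener still raised, and the status was only reachable through the caught HTTPError.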
If, however, you don't want Python to raise an exception when you get a non-OK response, it's unclear how to disable this behavior.
If you build your own opener with build_opener(), the documentation notes, emphasis added,
"Instances of the following classes will be in front of the handlers, unless the handlers contain them, instances of them or subclasses of them: ... HTTPDefaultErrorHandler ..."
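Therefore, we need to make our own subclass of one of those default classes. The one to replace is HTTPErrorProcessor, which appears in the same list as HTTPDefaultErrorHandler: it is the step that hands every non-2xx response to the error handlers, which is how HTTPDefaultErrorHandler ends up raising. A subclass that skips this step and simply passes the response through the pipeline unmodified never triggers the exception. Then build_opener() will use our processor instead of the default one.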
import urllib.request

class NonRaisingHTTPErrorProcessor(urllib.request.HTTPErrorProcessor):
    # Pass every response through unchanged, whatever its status code.
    def http_response(self, request, response):
        return response
    https_response = http_response

opener = urllib.request.build_opener(NonRaisingHTTPErrorProcessor)
response = opener.open('http://example.com/doesnt-exist')
print(response.status)  # prints 404
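One side effect worth knowing: HTTPErrorProcessor is also the step that hands 3xx responses to HTTPRedirectHandler, so an opener built with the pass-through processor returns redirects as-is instead of following them. If that is not what you want, a narrower variant can pass through only 4xx and 5xx responses and defer everything else to the base class. The sketch below is one way to do it, not the only one; ErrorStatusPassthrough, DemoHandler, demo_statuses, and the local test server are all hypothetical names introduced for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class NonRaisingHTTPErrorProcessor(urllib.request.HTTPErrorProcessor):
    """The pass-through processor from above: returns every response as-is."""

    def http_response(self, request, response):
        return response

    https_response = http_response


class ErrorStatusPassthrough(urllib.request.HTTPErrorProcessor):
    """Hypothetical variant: return 4xx/5xx as-is, keep default 3xx handling."""

    def http_response(self, request, response):
        if response.status >= 400:
            return response  # no HTTPError for client/server errors
        # 2xx is returned unchanged by the base class; 3xx goes to the
        # error handlers as usual, which includes the redirect handler.
        return super().http_response(request, response)

    https_response = http_response


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /old redirects to /new, the rest is 404."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            body = b""
        elif self.path == "/new":
            self.send_response(200)
            body = b"ok"
        else:
            self.send_response(404)
            body = b"not found"
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def demo_statuses(processor):
    """Fetch /old and /missing through an opener that uses `processor`."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    # The empty ProxyHandler only makes the sketch ignore proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}), processor)
    try:
        statuses = {}
        for path in ("/old", "/missing"):
            with opener.open(base + path) as response:
                statuses[path] = response.status
        return statuses
    finally:
        server.shutdown()
        server.server_close()
```

With ErrorStatusPassthrough, demo_statuses should report 200 for /old (the redirect was followed) and 404 for /missing; with NonRaisingHTTPErrorProcessor, /old should come back as the raw 302.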
This answer (including the code sample) was not written by ChatGPT, but it did point out the solution.