
What exceptions can be raised by Python HTTPX's json() method?


The excellent Python HTTPX package has a .json() method for conveniently decoding responses that are in JSON format. But the documentation does not mention what exceptions can be raised by .json(), and even looking through the code, it is not obvious to me what the possible exceptions might be.

For purposes of writing code like the following,

try:
    return response.json()
except EXCEPTION:
    print('This is not the JSON you are looking for')

what exceptions should I test for?


Solution

  • even looking through the code, it is not obvious to me what the possible exceptions might be.

    Here's my attempt at looking through it:

    def json(self, **kwargs: typing.Any) -> typing.Any:
        if self.charset_encoding is None and self.content and len(self.content) > 3:
            encoding = guess_json_utf(self.content)
            if encoding is not None:
                return jsonlib.loads(self.content.decode(encoding), **kwargs)
        return jsonlib.loads(self.text, **kwargs)
    
    • self.content can raise ResponseNotRead if (as it sounds like) the response has not been read yet. However, this is almost certainly due to a logical error in the code rather than any meaningful problem at runtime worth detecting; so there is no good reason to catch this. (It also wouldn't happen with the most straightforward use cases, such as the one shown in the documentation.) Otherwise, self.content will be a bytes, so len will work.

    • self.charset_encoding will either return None (if there is no corresponding data in the response header) or else eventually use the standard library email.message.Message.get_content_charset to parse a content type from the response header. The latter is not documented to raise any exceptions; so there should not be any exception from accessing this value.

    • guess_json_utf will necessarily be passed a bytes that is at least 4 bytes long. Nothing in its logic should be able to fail under those conditions. Either None or a string will be returned.

    • If jsonlib.loads is called using self.content.decode(encoding), then encoding was necessarily not None, and is a valid encoding name returned from guess_json_utf. However, it's possible that self.content (the bytes data) is not valid data (i.e., not convertible to text using the guessed text encoding). This would cause UnicodeDecodeError.

    • Otherwise, it's called with self.text. If the underlying _text was set before, it will be a string and gets returned. If there is no content in the response, an empty string is returned. Otherwise, decoding is attempted using a default text encoder with a "replace" error-handling policy. There again shouldn't be anything that could cause the text encoding name to be invalid, so again this can only raise UnicodeDecodeError, and even that shouldn't be possible with the "replace" error-handling policy.

    • Finally: jsonlib.loads is simply json.loads (i.e., the standard library json). If the code gets this far, json.loads will definitely be given a string to load; all possible issues with the string contents will be reported as JSONDecodeError.
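The two failure modes traced above can be reproduced with the standard library alone. This is a minimal sketch, not HTTPX's own code: classify is a hypothetical helper, and it assumes the guessed encoding is UTF-8, which is what one would expect for a body with no charset in the header and no NUL bytes.

```python
import json

# Valid Latin-1, but not valid UTF-8 (0xE9 is "e with acute" in Latin-1).
latin1_body = '{"name": "caf\u00e9"}'.encode("latin-1")


def classify(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode and parse raw bytes, reporting which step failed (hypothetical helper)."""
    try:
        json.loads(raw.decode(encoding))
    except UnicodeDecodeError:
        # The bytes are not valid in the assumed encoding.
        return "UnicodeDecodeError"
    except json.JSONDecodeError:
        # The bytes decoded to text, but the text is not valid JSON.
        return "JSONDecodeError"
    return "ok"


print(classify(latin1_body))      # strict decoding fails
print(classify(b""))              # an empty body is not valid JSON
print(classify(b'{"ok": true}'))  # well-formed input

# With errors="replace", decoding does not raise; bad bytes become U+FFFD.
print(latin1_body.decode("utf-8", errors="replace"))
```

Both json.JSONDecodeError and UnicodeDecodeError are subclasses of ValueError, so a single except ValueError clause would also cover them, at the cost of catching unrelated ValueErrors too.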


    tl;dr: the exceptions that make sense to catch are JSONDecodeError (the response is not valid JSON, which includes an empty response) and UnicodeDecodeError (the response is corrupt in some way: it mixes bytes intended to encode text in two different ways; or it is supposed to be UTF-8 but contains bytes that are illegal in that encoding; or it is encoded using a non-UTF scheme, such as Latin-1, that is incompatible with the guessed UTF scheme, and does not advertise its encoding in the header).
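Applied to the snippet from the question, that might look like the following sketch. The name json_or_none is hypothetical, and response is assumed to be an httpx.Response, although any object with a .json() method behaves the same way here.

```python
import json


def json_or_none(response):
    """Return response.json(), or None when the body cannot be decoded.

    `response` is assumed to be an httpx.Response; this wrapper is a
    hypothetical helper, not part of HTTPX.
    """
    try:
        return response.json()
    except json.JSONDecodeError:
        # The body decoded to text but is not valid JSON (includes an empty body).
        print('This is not the JSON you are looking for')
    except UnicodeDecodeError:
        # The body bytes do not match the guessed UTF encoding.
        print('The response body is not validly encoded text')
    return None
```

ResponseNotRead is deliberately not caught: as noted above, it points to a logic error in the calling code rather than a bad response.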