python xml api encoding quickbase

Quickbase API returns data in CP1252 encoding but says it's returning UTF-8


I'm having trouble with encodings when calling the Quickbase API using Python. I call the API to get a record, and specify the encoding parameter in the request as "UTF-8". The XML response I get back from Quickbase says it's UTF-8, because the XML starts with:

<?xml version="1.0" encoding="utf-8" ?>

However, the XML bytes are actually encoded as CP1252. I've confirmed this because a right single quotation mark (Unicode character U+2019) is being encoded as the single byte 0x92 (CP1252) rather than the UTF-8 byte sequence 0xE2 0x80 0x99. Any idea why Quickbase is saying the XML response is in one encoding (UTF-8) but actually using another (CP1252)?
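The byte-level check described above can be reproduced with a short sketch. It assumes Python 3; the helper name `looks_like_utf8` is hypothetical and exists only for illustration, and the sample bytes are made up rather than taken from a real Quickbase response.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK encoded in both candidate encodings.
quote = "\u2019"
cp1252_bytes = quote.encode("cp1252")  # one byte: 0x92
utf8_bytes = quote.encode("utf-8")     # three bytes: 0xE2 0x80 0x99


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone 0x92 byte is not valid UTF-8, so a CP1252 payload containing it
# fails strict UTF-8 decoding.
mislabeled = b"It\x92s"                    # CP1252 bytes, as described above
genuine = "It\u2019s".encode("utf-8")      # real UTF-8 bytes
```

Here `looks_like_utf8(mislabeled)` is false and `looks_like_utf8(genuine)` is true, which matches the symptom: the declaration says UTF-8, but the bytes are not.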

Note that I'm also passing an "Accept-Charset: utf-8" header in the request, but that has no effect.


Solution

  • Any idea why Quickbase is saying the XML response is one encoding (UTF-8) but actually using another (CP1252)?

    Probably because a Quickbase developer copied and pasted the XML declaration without actually understanding what the encoding attribute means.

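    The easiest workaround is to re-encode the raw response bytes before parsing: xml_response = xml_response.decode('windows-1252').encode('UTF-8'). This gives you genuine UTF-8 bytes that match the XML declaration, which you can then pass to the XML parser.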
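As a minimal sketch of that workaround in Python 3, assuming the response body is available as raw bytes: the function name `fix_quickbase_encoding` and the sample payload are hypothetical, and the "try UTF-8 first" guard is an addition not in the original answer, included so that a response that really is UTF-8 is not double-converted.

```python
import xml.etree.ElementTree as ET


def fix_quickbase_encoding(raw: bytes) -> bytes:
    """Return UTF-8 bytes, re-encoding from windows-1252 only when needed.

    Hypothetical helper: if the bytes already decode as UTF-8 they are
    returned unchanged; otherwise they are assumed to be windows-1252.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode("windows-1252").encode("utf-8")


# Made-up sample: declared as UTF-8 but containing the CP1252 byte 0x92.
sample = (
    b'<?xml version="1.0" encoding="utf-8" ?>'
    b"<record><name>It\x92s here</name></record>"
)

root = ET.fromstring(fix_quickbase_encoding(sample))
name = root.find("name").text  # the 0x92 byte is now U+2019
```

Note that Python's windows-1252 codec rejects the few byte values the code page leaves undefined (such as 0x81), so a response containing those would still raise `UnicodeDecodeError` and need separate handling.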