Tags: python, json, utf-8, flask

python jsonify dictionary in utf-8


I want to get JSON data in UTF-8.

I have a list:

    my_list = []

I then append many unicode values to it, like this:

    my_list.append(u'ტესტ')

    return jsonify(result=my_list)

and the response comes back with the characters escaped:

    {
      "result": [
        "\u10e2\u10d4\u10e1\u10e2",
        "\u10e2\u10dd\u10db\u10d0\u10e8\u10d5\u10d8\u10da\u10d8"
      ]
    }

Solution

  • Use the standard-library json module instead and pass ensure_ascii=False when encoding, or do the same with flask.json.dumps(). In a Python 2 session:

    >>> data = u'\u10e2\u10d4\u10e1\u10e2'
    >>> import json
    >>> json.dumps(data)
    '"\\u10e2\\u10d4\\u10e1\\u10e2"'
    >>> json.dumps(data, ensure_ascii=False)
    u'"\u10e2\u10d4\u10e1\u10e2"'
    >>> print json.dumps(data, ensure_ascii=False)
    "ტესტ"
    >>> json.dumps(data, ensure_ascii=False).encode('utf8')
    '"\xe1\x83\xa2\xe1\x83\x94\xe1\x83\xa1\xe1\x83\xa2"'
    

    Note that you still need to encode the result to UTF-8 explicitly, because dumps() returns a unicode object rather than a byte string in that case.
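For readers on Python 3, here is a minimal sketch of the same steps. The variable names are illustrative only. On Python 3, dumps() returns a str, so the explicit encode step produces a bytes object.

```python
import json

data = "ტესტ"

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(data)

# ensure_ascii=False leaves the Georgian characters as they are.
readable = json.dumps(data, ensure_ascii=False)

# dumps() returns text, so encode explicitly to get UTF-8 bytes.
payload = readable.encode("utf-8")

print(escaped)   # ASCII-only JSON text
print(payload)   # UTF-8 encoded bytes
```

Both forms decode to the same value; the only difference is how the characters are written in the serialized text.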

    You can make this the default (and use jsonify() again) by setting JSON_AS_ASCII to False in your Flask app config.
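As a rough configuration sketch, the setting could be applied as shown below. The route and function name are hypothetical. The second assignment is an assumption about newer Flask releases: as far as I know, they moved this switch from the JSON_AS_ASCII config key to an ensure_ascii attribute on app.json. Check the documentation for the version you run.

```python
from flask import Flask, jsonify

app = Flask(__name__)

# The setting named in the answer; older Flask releases read it from the config.
app.config["JSON_AS_ASCII"] = False

# Assumption: newer releases expose a JSON provider as app.json and expect
# the flag there instead. The hasattr guard keeps this harmless elsewhere.
if hasattr(app, "json"):
    app.json.ensure_ascii = False


@app.route("/")  # hypothetical route, for illustration only
def index():
    my_list = ["ტესტ"]
    return jsonify(result=my_list)
```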

    WARNING: do not include untrusted data in JSON that is not ASCII-safe and then interpolate it into an HTML template or serve it from a JSONP API; doing so can cause syntax errors or open a cross-site scripting vulnerability. That's because JSON is not a strict subset of JavaScript (at least in engines that predate ECMAScript 2019), and with ASCII-safe encoding disabled the U+2028 and U+2029 separators are no longer escaped to \u2028 and \u2029 sequences.
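One way to keep readable UTF-8 output while still escaping the two separators is to post-process the serialized text. This is a sketch under assumptions: dumps_js_safe is a hypothetical helper, not part of json or Flask. It only handles U+2028 and U+2029; it does not make the output safe to embed in HTML, where characters such as < need separate handling.

```python
import json


def dumps_js_safe(obj):
    """Serialize to non-ASCII JSON text but keep U+2028/U+2029 escaped.

    Hypothetical helper for illustration; not part of json or Flask.
    """
    text = json.dumps(obj, ensure_ascii=False)
    # In json.dumps output these separators can only occur inside JSON
    # strings, so swapping them for escape sequences does not change the data.
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")


sample = {"result": ["ტესტ", "line\u2028break"]}
safe = dumps_js_safe(sample)
```

Parsing the result with json.loads() gives back the original data, since \u2028 is an ordinary JSON escape sequence.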