Search code examples
jsonunicodeasciipython-2.x

Forcing Python json module to work with ASCII


I'm using json.dump() and json.load() to save/read a dictionary of strings to/from disk. The issue is that I can't have any of the strings in unicode. They seem to be in unicode no matter how I set the parameters to dump/load (including ensure_ascii and encoding).


Solution

  • If you are just dealing with simple JSON objects, you can use the following:

    def ascii_encode_dict(data):
        ascii_encode = lambda x: x.encode('ascii')
        return dict(map(ascii_encode, pair) for pair in data.items())
    
    json.loads(json_data, object_hook=ascii_encode_dict)
    

    Here is an example of how it works:

    >>> json_data = '{"foo": "bar", "bar": "baz"}'
    >>> json.loads(json_data)                                # old call gives unicode
    {u'foo': u'bar', u'bar': u'baz'}
    >>> json.loads(json_data, object_hook=ascii_encode_dict) # new call gives str
    {'foo': 'bar', 'bar': 'baz'}
    

    This answer works for a more complex JSON structure, and gives some nice explanation on the object_hook parameter. There is also another answer there that recursively takes the result of a json.loads() call and converts all of the Unicode strings to byte strings.