Search code examples
pythonjsonpostgresqlutf-8psycopg2

Psycopg2 Json UTF8


In Postgres I have column of json type with UTF-8 encoding. I am inserting json from Python with Psycopg2

cursor.execute("INSERT INTO my_table (json_data) VALUES (%s)",Json(data))

The result in the database is

{"item": "r\u00f6b"}

Meaning ö appears as \u00f6. When I manually insert this item through pgAdmin, it appears normally as ö. The documentation says that psycopg2.extras.Json uses dumps inside, which has this encoding issue. Could you suggest a solution to this to allow storing {"item": "röb"} without \u ?


Solution

  • This is due to json.dumps having the default of ensure_ascii=True per json:

    If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is. ...

    As permitted, though not required, by the RFC, this module’s serializer sets ensure_ascii=True by default, thus escaping the output so that the resulting strings only contain ASCII characters.

    The solution is to override the default by creating a lambda function that uses dumps(ensure_ascii=False). Derived from examples here psycopg2 json.

    import json
    from psycopg2.extras import Json 
    
    json.dumps('ö')                                                                                                                                                            
    '"\\u00f6"'
    
    json.dumps('ö', ensure_ascii=False)                                                                                                                                        
    '"ö"'
    
    dumps_lambda = lambda x: json.dumps(x, ensure_ascii=False)
    
    j = Json('ö', dumps=dumps_lambda)                                                                                                                                         
    
    j.adapted                                                                                                                                                                 
    'ö'