Search code examples
pythonjsonpymysql

How to query a JSON MySQL column into a dict?


My database looks like this:

CREATE TABLE my_table( irrelevant_column TEXT, json_stuff JSON );
mysql> SELECT * FROM my_table;

+------------------------+-----------------------------------------------+
| irrelevant_column      | json_stuff                                    |
+------------------------+-----------------------------------------------+
| blah blah              | { 'this': 'is my json data' }                 |
| more unrelated stuff   | { 'more': 'of my data in the correct format' }|
+------------------------+-----------------------------------------------+

I am looking for an elegant way to query JSON data from MySQL into a list of dicts.

I have tried using a DictCursor

cursor = connection.cursor( pymysql.cursors.DictCursor )
cursor.execute( "SELECT json_stuff FROM my_table" )
rows = cursor.fetchall()

but it doesn't properly handle the JSON column type. It returns this:

[
  { "json_stuff": "<json_stuff row 1 in string format>" },
  { "json_stuff": "<json_stuff row 2 in string format>" }, 
  etc.
]

I would like

[
  { 'this': 'is my json data' },
  { 'more': 'of my data in the correct format' }, 
  etc.
]

This ugly and inefficient code works.

def get_list_of_stuff():
    cursor = connection.cursor()
    cursor.execute( "SELECT json_stuff FROM my_table" )
    rows = cursor.fetchall()
    return [ json.loads( row["json_stuff"] ) for row in rows ]

Does anyone know a way to do this without looping through all of the rows and parsing each one into JSON?


Solution

  • The MySQL developers apparently don't consider it appropriate for drivers to automatically convert between the JSON data type and Python structured types. A bug report JSON columns should accept Python dicts as input was submitted a couple of years ago. It was closed as "Not a bug", with the comment:

    Because JSON is not a built-in type (or even type in Python). This kind of conversions shouldn't be made in a driver, is better suited for frameworks that have an high level of abstraction.

    Since you're not using a framework, just a low-level database API, you need to do the parsing yourself.

    Personally, I disagree with their reasoning, and I think your expectation is reasonable. But that's the way it is.