Search code examples
mysqlmysql-json

Select rows with specific value in JSON column


I have a column in MySQL which has an array of JSON with multi objects, this is one of its values:

[
    {
        "date": "2014-01-17T00:00:00.000Z",
        "price": "56.89845204047926330304107978008687496185302734375",
        "total": "105774.2223432509504803533673",
        "amount": 1859,
        "symbol": "fb",
        "transaction_code": "buy"
    },
    {
        "date": "2015-04-20T00:00:00.000Z",
        "price": "121.497274059658224132363102398812770843505859375",
        "total": "962865.8969227914262489775865",
        "amount": 7925,
        "symbol": "aapl",
        "transaction_code": "buy"
    },
    {
        "date": "2013-11-06T00:00:00.000Z",
        "price": "49.18047492858931235559794004075229167938232421875",
        "total": "330738.6938947631255913961468",
        "amount": 6725,
        "symbol": "fb",
        "transaction_code": "sell"
    }
]

I want to extract objects with "transaction_code": "sell" and find the average price of them.


Solution

  • In MySQL 8.0, you can use the JSON_TABLE() function to transform the data so you can apply a WHERE clause to it.

    SELECT AVG(j.price)
    FROM mytable CROSS JOIN JSON_TABLE(mytable.example_json, '$[*]' COLUMNS(
        date DATETIME PATH '$.date',
        price DOUBLE PATH '$.price',
        total DOUBLE PATH '$.total',
        amount INT PATH '$.amount',
        symbol VARCHAR(10) PATH '$.symbol',
        transaction_code VARCHAR(10) PATH '$.transaction_code'
      )
    ) AS j
    WHERE j.transaction_code = 'sell';
    

    This works all right, but it will be bound to have poor performance, and it's more work to write the code every time you want to do this.

    In versions of MySQL before 8.0, you're out of luck. Fetch the whole JSON document into a client application, deserialize it into a data structure, and write custom code to loop over it.

    It would be better to talk to whomever decided to use JSON to store this data, and convince them to use normal rows and columns. The queries will be easier, and you can create indexes to optimize them.

    Also use the DECIMAL type for currency values, not DOUBLE. No one wants to make exact change for 56.89845204047926330304107978008687496185302734375!