Search code examples
mysqljsonmysql-8.0

How to group by a key in json datatype column in mysql


I have the following structure for a JSON column order_summary in my orders table,

//Order 1
{
  "total": 16.895,
  .....
  "products": [
    {
      ...,
      "quantity": 2,
      "variant_id": 98
    },
    {
      ...,
      "quantity": 3,
      "variant_id": 99
    },
  ],
}
//Order 2
{
  "total": 10.895,
  .....
  "products": [
    {
      ...,
      "quantity": 2,
      "variant_id": 98
    },
    {
      ...,
      "quantity": 4,
      "variant_id": 100
    },
  ],
}

Now I need to find the total sold quantity for each variant_id

I am looking at something like following

SELECT sum(order_summary->"$.products[*].quantity") 
FROM orders  
GROUP BY order_summary->"$.products[*].variant_id"

But I am just getting zeros. Any help is greatly appreciated.


Solution

  • order_summary->"$.products[*].quantity" results in an array.

    mysql> SELECT order_summary->"$.products[*].quantity" from orders;
    +-----------------------------------------+
    | order_summary->"$.products[*].quantity" |
    +-----------------------------------------+
    | [2, 3]                                  |
    | [2, 4]                                  |
    +-----------------------------------------+
    

    The sum of which is 0.

    mysql> select sum("[2,3]");
    +--------------+
    | sum("[2,3]") |
    +--------------+
    |            0 |
    +--------------+
    

    To make this work, first you have to convert the JSON into rows using json_table. Then they can be aggregated like normal columns.

    select products.*
    from orders,
        json_table(
          -- select the JSON column from orders
          orders.order_summary,
          -- filter only the products
          "$.products[*]"
          -- translate them into columns
          columns(
            quantity int path "$.quantity",
            variant_id int path "$.variant_id"
          )
        ) products
    ;
    
    +----------+------------+
    | quantity | variant_id |
    +----------+------------+
    |        2 |         98 |
    |        3 |         99 |
    |        2 |         98 |
    |        4 |        100 |
    +----------+------------+
    

    Then we can manipulate the products psuedo-table normally.

    select products.variant_id, sum(products.quantity)
    from orders,
        json_table(
          order_summary,
          "$.products[*]"
          columns(
            quantity int path "$.quantity",
            variant_id int path "$.variant_id"
          )
        ) products
    group by products.variant_id;
    
    +------------+------------------------+
    | variant_id | sum(products.quantity) |
    +------------+------------------------+
    |         98 |                      4 |
    |         99 |                      3 |
    |        100 |                      4 |
    +------------+------------------------+