Search code examples
mysqlmysql-8.0mysql-json

How to extract and flatten and array that is inside of JSON objects in MySQL 8?


Given some json documents like this:

{
   "_id":"00006073",
    "subscribersIds":[
      170968,
      225647
   ]
}
-----------------------------------
{
   "_id":"00006072",
   "subscribersIds":[
      170968
   ]
}
--------------------------------
{
   "_id":"00006074,
   "subscribersIds":[
      228195,
      225647
   ]
}

Do you know how can I get a list of the subscribersIds, without repetition? The result should be something like this 170968, 225647, 228195, because I will need to use the result of this query as a condition of another one.

For Couchebase there is the "UNNEST" command that does it, however I didn't find the properly way to do it in MySQL 8, as SELECT DISTINCT doc ->> '$.subscribersIds[*]' FROM customers will return [170968,225647],[170968],[228195,225647]

Thanks in advance!


Solution

  • mysql> select * from mytable;
    +----------+------------------+
    | _id      | subscriberIds    |
    +----------+------------------+
    | 00006072 | [170968]         |
    | 00006073 | [170968, 225647] |
    | 00006074 | [228195, 225647] |
    +----------+------------------+
    
    mysql> select j.subscriberId from mytable, 
      json_table(mytable.subscriberIds, '$[*]' columns (subscriberId int path '$')) j;
    +--------------+
    | subscriberId |
    +--------------+
    |       170968 |
    |       170968 |
    |       225647 |
    |       228195 |
    |       225647 |
    +--------------+
    
    mysql> select distinct j.subscriberId from mytable,
      json_table(mytable.subscriberIds, '$[*]' columns (subscriberId int path '$')) j;
    +--------------+
    | subscriberId |
    +--------------+
    |       170968 |
    |       225647 |
    |       228195 |
    +--------------+
    

    This is a fairly complex query to write every time you want to get the set of distinct subscriberIds.

    It would be much easier if you did not use JSON at all, but stored the id's in a normalized manner, one per row in a second table.

    mysql> create table mySubscribers (_id char(8), subscriberId int, primary key (_id, subscriberId));
    
    mysql> insert into mySubscribers (_id, subscriberId) select _id, subscriberId from mytable, json_table(subscriberIds, '$[*]' columns (subscriberId int path '$')) j;
    Records: 5  Duplicates: 0  Warnings: 0
    
    mysql> select * from mySubscribers;
    +----------+--------------+
    | _id      | subscriberId |
    +----------+--------------+
    | 00006072 |       170968 |
    | 00006073 |       170968 |
    | 00006073 |       225647 |
    | 00006074 |       225647 |
    | 00006074 |       228195 |
    +----------+--------------+
    
    mysql> select distinct subscriberId from mySubscribers;
    +--------------+
    | subscriberId |
    +--------------+
    |       170968 |
    |       225647 |
    |       228195 |
    +--------------+