Search code examples
mysqlarraysmysql-8.0mysql-json

How to Update an element from a JSON Array based on 2 conditions in Mysql 8.0.26


I have the following json data in a mysql column:

{
"components":[
      {
         "url":"www.google.com",
         "order":3,
         "accountId":"123",
         "autoPilot":true,
      },
      {
         "url":"www.youtube.com",
         "order":7,
         "accountId":"987",
         "autoPilot":false,
      },
   "addTabs":true,
   "activateHomeSection":true 
}

I need to update the url attribute based on accountId and autopilot attribute.

For example I pass:

  • accountId = 123,
  • autopilot = true,
  • url = 'facebook.com'

Result: AccountId and autopilot matches -> www.google.com url changes to www.facebook.com

{
"components":[ 
         {
         "url":"www.google.com",
         "order":3,
         "accountId":"123",
         "autoPilot":true
         },
 ...

Till this point I managed to write this code:

select j.url from users, json_table(custom_page, '$.components[*]' columns (
    autoPilot boolean path '$.autoPilot',
    accountId varchar(36) path '$.accountId',
    url TEXT path '$.url')) as j
where username='mihai2nd' AND j.autoPilot = true and j.accountId='123'; 

I was planning to retrieve first the url, and then use a regex function to update it. But such way is not good in performance, as I will need to send 2 queries. I tried also other methods with JSON_SEARCH / JSON_CONTAINS / JSON_EXTRACT, but failed.

Do you know guys a nice way to achieve this, without affecting the performance ?


Solution

  • The best way to do this with good performance is to store the attributes in a normal table with rows and columns, instead of storing them in JSON. If you store the data in JSON, it resists being optimized with indexes.

    MySQL 8.0.17 and later support Multi-Valued Indexes which allows you to create an index over a JSON array. Unfortunately, you can only reference one column as a multi-valued index, so you need to choose which one of your attributes is most likely to help improve the performance if it is searchable via an index lookup. Let's assume it's accountId.

    mysql> create table user (custom_page json);
    
    mysql> alter table user add index bk1 ((CAST(custom_page->'$.components[*].accountId' AS UNSIGNED ARRAY)));
    
    mysql> explain select * from user where 12 member of (custom_page->'$.components[*].accountId');
    +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
    | id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
    +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
    |  1 | SIMPLE      | user  | NULL       | ref  | bk1           | bk1  | 9       | const |    1 |   100.00 | Using where |
    +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
    

    You can see in the above example that the optimizer will use the index I created when I use the MEMBER OF() predicate to search.

    That will at least narrow down the search to rows where the value is found. Then you can use your JSON_TABLE() approach to extract exactly which entry has the attributes you want, knowing that the search will be limited to the rows that matched your MEMBER OF() predicate.

    select user.id, j.* from user 
    cross join json_table(custom_page, '$.components[*]' columns(
      rownum for ordinality, 
      autoPilot boolean path '$.autoPilot', 
      accountId varchar(36) path '$.accountId', 
      url text path '$.url')
    ) as j 
    where 123 member of (custom_page->'$.components[*].accountId');
    
    +----+--------+-----------+-----------+-----------------+
    | id | rownum | autoPilot | accountId | url             |
    +----+--------+-----------+-----------+-----------------+
    |  1 |      1 |         1 | 123       | www.google.com  |
    |  1 |      2 |         0 | 987       | www.youtube.com |
    +----+--------+-----------+-----------+-----------------+
    

    Now you can filter these generated rows:

    select user.id, j.* from user
    cross join json_table(custom_page, '$.components[*]' columns(
      rownum for ordinality, 
      autoPilot boolean path '$.autoPilot',
      accountId varchar(36) path '$.accountId',
      url text path '$.url')
    ) as j
    where 123 member of (custom_page->'$.components[*].accountId')
    and j.autoPilot = true
    and j.accountId = 123;
    
    +----+--------+-----------+-----------+----------------+
    | id | rownum | autoPilot | accountId | url            |
    +----+--------+-----------+-----------+----------------+
    |  1 |      1 |         1 | 123       | www.google.com |
    +----+--------+-----------+-----------+----------------+
    

    That's SELECT, but you need to UPDATE.

    with cte as (
      select user.id, j.* from user
      cross join json_table(custom_page, '$.components[*]' columns(
        rownum for ordinality, 
        autoPilot boolean path '$.autoPilot',
        accountId varchar(36) path '$.accountId',
        url text path '$.url')
      ) as j 
      where 123 member of (custom_page->'$.components[*].accountId')
      and j.autoPilot = true
      and j.accountId = 123
    ) 
    update user cross join cte
    set custom_page = json_set(custom_page, concat('$.components[', cte.rownum-1, '].url'), 'facebook.com')
    where cte.id = user.id;
    
    Query OK, 1 row affected (0.00 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    

    Then check that it updated the way you intended:

    mysql> select id, json_pretty(custom_page) from user;
    +----+--------------------------+
    | id | json_pretty(custom_page) |                                     
    +----+--------------------------+
    |  1 | {
      "addTabs": true,
      "components": [
        {
          "url": "facebook.com",
          "order": 3,
          "accountId": "123",
          "autoPilot": true
        },
        {
          "url": "www.youtube.com",
          "order": 7,
          "accountId": "987",
          "autoPilot": false
        }
      ],
      "activateHomeSection": true
    } |
    +----+--------------------------+
    

    Frankly, this is needlessly difficult and complex. It's going to be very time-consuming to develop any code to change JSON data if you store it this way. If you need to work with JSON as though it is a set of discrete rows and columns, then you should create a table with normal rows and columns. Then virtually anything you struggled to do with JSON functions becomes simple and straightforward:

    CREATE TABLE page_components (
      id INT PRIMARY KEY,
      accountId VARCHAR(36) NOT NULL,
      autoPilot BOOLEAN NOT NULL DEFAULT FALSE,
      `order` INT NOT NULL DEFAULT 1, 
      url TEXT NOT NULL,
      KEY (accountId, autoPilot)
    );
    

    ...populate it with data you would have stored in JSON...

    UPDATE page_components
    SET url = 'facebook.com'
    WHERE accountId = 123 AND autoPilot = true;
    

    The more I see questions on Stack Overflow about JSON in MySQL, the more I conclude that it's the worst feature ever to add to a relational database.