Search code examples
mysqlmysql-json

Using the LIKE operator to look for matches in a json-like string


I need to find those notifications that have the eventCode set to REFUND. Here is the object stored in the DB:

{
    "target": "target",
    "method": "POST",
    "headers": {
        "x-forwarded-host": "x-forwarded-host",
        "x-real-ip": "x-real-ip",
        "x-forwarded-for": "x-forwarded-for",
        "x-forwarded-proto": "x-forwarded-proto",
        "host": "host",
        "connection": "connection",
        "content-length": "content-length",
        "cf-ipcountry": "cf-ipcountry",
        "accept-encoding": "accept-encoding",
        "cf-ray": "cf-ray",
        "cdn-loop": "cdn-loop",
        "cf-connecting-ip": "cf-connecting-ip",
        "cf-visitor": "cf-visitor",
        "traceparent": "traceparent",
        "content-type": "application/json; charset=utf-8",
        "user-agent": "user-agent"
    },
    "body": "{\"live\":\"true\",\"notificationItems\":[{\"NotificationRequestItem\":{\"additionalData\":{\"hmacSignature\":\"yadayadayada\",\"bookingDate\":\"2023-01-10T11:54:37Z\"},\"amount\":{\"currency\":\"EUR\",\"value\":2500},\"eventCode\":\"REFUND\",\"eventDate\":\"2023-01-10T11:53:43+02:00\",\"merchantAccountCode\":\"yadayada\",\"merchantReference\":\"yadayada\",\"originalReference\":\"yadayada\",\"paymentMethod\":\"mc\",\"pspReference\":\"yadayada\",\"reason\":\"\",\"success\":\"true\"}}]}"
}

I'm using a zsh terminal and trying to write a query that would parse that body string:

select * from notifications where order_id=123456 and json_extract(data, '$.body') like '%\"eventCode\":\"REFUND\"%' \G

It does not work. By trial and error I found out that this works:

select * from notifications where id=123456 and json_extract(data, '$.body') like '%\"REFUND%' \G

but this does not:

select * from notifications where id=123456 and json_extract(data, '$.body') like '%:\"REFUND%' \G

Looks like the semicolon breaks it somehow.

What am I doing wrong?

UPDATE:

So I incorporated this query into a bigger one to look like this:

SELECT concat(date(o.timestamp), ' ', extract(hour FROM o.timestamp), 'h') AS date,
       time_to_sec((timediff(n.timestamp, o.timestamp))) / 60 AS 'timedelay (minutes)',
       o.id AS order_id,
       c.user_id AS user_id
FROM notifications n
         JOIN orders AS o ON n.order_id = o.id
         JOIN credentials AS c ON o.credential_id = c.id
         JOIN merchants AS m ON c.merchant_id = m.id
         JOIN providers AS p ON m.provider_id = p.id
         CROSS JOIN JSON_TABLE(
            JSON_UNQUOTE(JSON_EXTRACT(n.data, '$.body')),
            '$.notificationItems[*]' COLUMNS (
            eventCode VARCHAR(20) PATH '$.NotificationRequestItem.eventCode'
        )) AS j
WHERE p.name = 'adyen'
  AND o.timestamp > '2024-04-11'
  AND time_to_sec((timediff(n.timestamp, o.timestamp))) / 60 > 60
  AND o.type = 'payment'
  AND j.eventCode = 'AUTHORISATION';

Except now it deals with a different event type. This query errors out with ERROR 3141 (22032): Invalid JSON text in argument 1 to function json_table: "Invalid value." at position 0.

One important detail here is that the tables notifications, credentials, merchants have a field called data. It looks like the query gets confused despite the use of aliases and the presence of no compelling reason for ambiguity.

One more observation:

This query

SELECT n.id FROM notifications AS n
JOIN orders o ON n.order_id = o.id
WHERE n.provider_id = 1
AND o.type > 'payment'
AND JSON_EXTRACT(JSON_UNQUOTE(JSON_EXTRACT(n.data,'$.body')), '$.notificationItems[0].NotificationRequestItem.eventCode') = 'AUTHORISATION'
LIMIT 5;

Errors out with ERROR 3141 (22032): Invalid JSON text in argument 1 to function json_extract: "The document is empty." at position 0..

But this query:

SELECT n.id FROM notifications AS n
WHERE n.order_id IN (
    SELECT o.id FROM orders AS o WHERE o.type = 'payment'
)
AND n.provider_id = 1
AND JSON_EXTRACT(
    JSON_UNQUOTE(JSON_EXTRACT(n.data,'$.body')),
    '$.notificationItems[0].NotificationRequestItem.eventCode'
) = 'AUTHORISATION'
LIMIT 5 \G

works just fine.

I've been working on rewriting the query in such a way that I could avoid using joins, but so far didn't come up with a decent solution.

As for the version:

mysql> SELECT Version();
+-----------+
| Version() |
+-----------+
| 8.0.35-27 |
+-----------+
1 row in set (0.00 sec)

Solution

  • I assume your notificationItems array may have more than one element, and you need to search for your REFUND value in any element of the array.

    If so, you need to use JSON_TABLE():

    SELECT n.id, j.eventCode
    FROM notifications AS n
    CROSS JOIN JSON_TABLE(
      JSON_UNQUOTE(JSON_EXTRACT(data, '$.body')),
      '$.notificationItems[*]' COLUMNS (
        eventCode VARCHAR(20) PATH '$.NotificationRequestItem.eventCode'
      )) AS j
    WHERE n.id=123456 AND j.eventCode = 'REFUND';