Search code examples
mysqljsoncommon-schema

Extracting value using common_schema not working for nested JSON


I have a DB field called data of type is MEDIUMTEXT, which stores values in JSON format. I'm using extract_json_value method from common_schema.

When JSON is not nested, it works fine. For example, when applications_data table's data field is

{
  "key": "value"
}

This query works fine:

SELECT data into @json from applications_data;
SELECT common_schema.extract_json_value(@json, 'key') as result;

and gives result: key

However, when the data field is a nested JSON, it fails. For example, JSON is:

{
  "key": {
    "overview": "sample"
  }
}

Using the same query as above and result is empty, not NULL:

enter image description here


Solution

  • Remember:

    extract_json_value

    ...

    This function internally relies on json_to_xml(): it first converts the JSON data to XML, then uses ExtractValue to apply XPath.

    ...

    and

    ExtractValue(xml_frag, xpath_expr)

    ...

    If no matching text node is found for the expression (including the implicit /text())—for whatever reason, as long as xpath_expr is valid, and xml_frag consists of elements which are properly nested and closed—an empty string is returned. No distinction is made between a match on an empty element and no match at all. This is by design.

    If you need to determine whether no matching element was found in xml_frag or such an element was found but contained no child text nodes, you should test the result of an expression that uses the XPath count() function.

    ...

    Test:

    mysql> SET @`json` := '
        '> {
        '>   "key": {
        '>     "overview": "sample"
        '>   }
        '> }
        '> ';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SELECT 
        ->   common_schema.extract_json_value(@`json`, 'key') AS result0,
        ->   common_schema.extract_json_value(@`json`, count('key')) AS result1,
        ->   common_schema.extract_json_value(@`json`, 'key/overview') AS result2,
        ->   common_schema.extract_json_value(@`json`, count('key/overview')) AS result3;
    +---------+---------+---------+---------+
    | result0 | result1 | result2 | result3 |
    +---------+---------+---------+---------+
    |         | 1       | sample  | 1       |
    +---------+---------+---------+---------+
    1 row in set (0.03 sec)