Search code examples
mysqlescapingmysql-jsonjson-table

JSON_TABLE - how to access fields inside escaped quotes?


I've got a large JSON field that I'm trying to parse out using JSON_TABLE. About half the data is straightforward, like:

"run_summary":{"num_records":6,"num_runtime_errors":0,"num_checkpoint_failed":4,"num_files":4}

But the other half contains escaped quotes, such as:

"current_review_id":"{\"value\": \"b13d2f43-7abb-40f8-9867-6349c0c84c47\", \"recorded_timestamp\": 1696865721868}"

I have no trouble breaking out the columns from the run_summary, i.e.

num_records int PATH '$.run_summary.num_records'

However, I can't figure out how to get at the ones inside the escaped quotes...this does not work, for example:

value varchar(40) PATH  '$.flow_job_metrics.current_review_id.value' 

...which just returns null.

I have tried putting additional quotes inside the path, like:

 '$.flow_job_metrics.current_review_id."value"' 

and other crazy things like:

 '$.flow_job_metrics.current_review_id.\"value\"' 

but these either result in syntax errors or more null values.

Can anyone help me figure out how to get at these?


Solution

  • Since the JSON is actually double-encoded, that is, part of your JSON is just a string, which is part of the "outer" JSON document, you'll have to double-decode it:

    create table mytable (
      id serial primary key,
      data json
    );
    
    insert into mytable set data = '{"current_review_id":"{\\"value\\": \\"b13d2f43-7abb-40f8-9867-6349c0c84c47\\", \\"recorded_timestamp\\": 1696865721868}"}';
    
    select id, j1.current_review_id, j2.value, j2.recorded_timestamp from mytable
    cross join json_table(data, '$' columns (
        current_review_id json path '$.current_review_id'
      )) as j1
    cross join json_table(json_unquote(j1.current_review_id), '$' columns (
        value varchar(36) path '$.value',
        recorded_timestamp bigint path '$.recorded_timestamp'
      )) as j2;
    

    Output, tested on MySQL 8.4.3:

    *************************** 1. row ***************************
                    id: 1
     current_review_id: "{\"value\": \"b13d2f43-7abb-40f8-9867-6349c0c84c47\", \"recorded_timestamp\": 1696865721868}"
                 value: b13d2f43-7abb-40f8-9867-6349c0c84c47
    recorded_timestamp: 1696865721868