Search code examples
mysqljsonescaping

mysql 8.0 process json with escape mark '\'


SET @v = '{"a":" \\u3b7e - c:\\user\\abc - \\"ET\\" "}';

SET @j = CAST(@v AS JSON);

failed Error Code: 3141. Invalid JSON text in argument 1 to function cast_as_json: "Incorrect hex digit after \u escape in string." at position 18. because cast_as_json will try to decode '\user' which is not a escape character.

if it like this c:\\\\user\\\\abc, everything fine.

Tried SET @v = REPLACE(@v, '\\', '\\\\');, it will break other string escape.

Tried SET @v = CONVERT(@v USING ASCII);, No matter what charset, no different

Mysql rule '\\' -> '\', '\u3b7e' -> '㭾'. Json rule " -> " in string

I don't know how to solve this.


Solution

  • Finally, I found the root cause is: In JSON, a Unicode surrogate pair must consist of two valid sequences: High surrogate: \uD800-\uDBFF Low surrogate: \uDC00-\uDFFF MySQL interprets JSON strictly and will throw an error if: The surrogate pair is incomplete or malformed. The string contains improperly escaped sequences.

    and I just replaced \ -> \\, which will make '\"', separately '\\' and '"', create error.

    Finally I replace all possible unicode surrogate to 0000. so far no more fail. Hope helps other people.

    SET @jsn = REGEXP_REPLACE(@jsn, '\\\\u[dD][0-9a-fA-F]{3}', '\\\\uD000');