python json string escaping

Why doesn't json.loads('{"Testing": "This quo\\\\"te String"}') work?


I'm trying to understand why the code below doesn't work in Python:

import json

s = json.loads(' {"Testing" : "This quo\\\\"te String"} ')

print(s)

Theoretically, what I should get back is {'Testing' : 'This quo\"te String'}.

These ones work fine:

print(json.loads(' {"Testing" : "This quo\\"te String"} ')) ----> {'Testing' : 'This quo"te String'}

print(json.loads(' {"Testing" :"This quo\\\\\\"te String"}')) ----> {'Testing' : 'This quo\\"te String'}

I'm guessing it has something to do with the idiosyncrasy of having a \" in the dict, but I can't figure out what exactly is happening.


Solution

  • The string This quo\"te String requires two escapes in normal Python: one for the \ and one for the ", making three backslashes in all:

    >>> print("This quo\\\"te String")
    This quo\"te String
    

    For JSON, all those backslashes must themselves be escaped, because the JSON string is embedded inside a Python string literal. Thus, six backslashes are required in total:

    >>> print(json.loads('"This quo\\\\\\"te String"'))
    This quo\"te String
    

    However, if raw-strings are used, no extra escapes are required:

    >>> print(json.loads(r'"This quo\\\"te String"'))
    This quo\"te String
    

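    In your first example, Python first reduces the four backslashes in the literal to two, which the JSON parser then reads as a single literal \ (i.e. as an escaped backslash). That leaves the " unescaped, so it terminates the JSON string early and the remaining te String" is invalid, which is why json.loads raises an error.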

    Note that it makes no difference if the string is inside a dict - the result will be exactly the same:

    >>> dct = json.loads('{"Testing": "This quo\\\\\\"te String"}')
    >>> print(dct['Testing'])
    This quo\"te String
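
To make the two layers of escaping visible, here is a minimal sketch that prints what the JSON parser actually receives for the failing literal, and then lets json.dumps produce the escaping instead of counting backslashes by hand. The variable names (source, failed, wanted, encoded) are only illustrative and not from the question.

```python
import json

# The literal from the question: four backslashes in Python source code.
source = '{"Testing": "This quo\\\\"te String"}'

# Layer 1: Python's own escaping halves the backslashes, so the JSON
# parser receives two backslashes followed by a bare double quote.
print(source)

# Layer 2: JSON reads the two backslashes as one escaped backslash, so the
# quote after them closes the string and the rest of the text is invalid.
try:
    json.loads(source)
    failed = False
except json.JSONDecodeError as exc:
    failed = True
    print("JSONDecodeError:", exc.msg)

# Going the other way: build the value you want, then let json.dumps
# add the JSON-level escapes.
wanted = {"Testing": 'This quo\\"te String'}  # the value is: This quo\"te String
encoded = json.dumps(wanted)
print(encoded)  # JSON text containing three backslashes before the quote

# The round trip returns the original value.
assert json.loads(encoded) == wanted
```

Printing the string before parsing it is usually the quickest way to see which layer consumed which backslash. The text printed for encoded is also exactly what the raw-string example above passes to json.loads.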