Python: json.loads chokes on escapes


I have an application that sends a JSON object (formatted with Prototype) to an ASP server. On the server, I call loads() from the Python 2.6 "json" module to parse it, but it chokes on some combination of backslashes. Observe:

>>> s
'{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'

>>> tmp = json.loads(s)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  {... blah blah blah...}
  File "C:\Python26\lib\json\decoder.py", line 155, in JSONString
    return scanstring(match.string, match.end(), encoding, strict)
  ValueError: Invalid \escape: line 1 column 58 (char 58)

>>> s[55:60]
u'ost\\d'

So column 58 is the escaped backslash. I thought this WAS properly escaped! The UNC path is \\host\dir\file.exe, so I just doubled up on the backslashes. But apparently this is no good. Can someone assist? As a last resort I'm considering converting the \ to / and then back again, but this seems like a real hack to me.

Thanks in advance!


Solution

  • The correct JSON, written as a Python string literal, is:

    s = r'{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'
    

    Note the r prefix (a raw string literal). If you omit it, you need to escape every \ for Python as well.

    >>> import json
    >>> d = json.loads(s)
    >>> d.keys()
    [u'FileExists', u'Path', u'Version']
    >>> d.values()
    [True, u'\\\\host\\dir\\file.exe', u'4.3.2.1']
    

    Note the difference between repr(), str(), and print:

    >>> repr(d[u'Path'])
    "u'\\\\\\\\host\\\\dir\\\\file.exe'"
    >>> str(d[u'Path'])
    '\\\\host\\dir\\file.exe'
    >>> print d[u'Path']
    \\host\dir\file.exe
    

    By default, the Python REPL displays repr(obj) for an object obj, not str(obj):

    >>> class A:
    ...   __str__ = lambda self: "str"
    ...   __repr__  = lambda self: "repr"
    ... 
    >>> A()
    repr
    >>> print A()
    str
    

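    Therefore your original string s is not properly escaped for JSON. What you see at the prompt is its repr; the actual text contains '\d' and '\f' with only a single backslash each. '\d' is not a valid JSON escape, which is what raises the error at char 58. '\f' is a valid JSON escape, but it means form feed, not a backslash followed by f. print s must show '\\d' and '\\f'; otherwise it is not the JSON you intended.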
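
To make the two levels of escaping concrete, here is a minimal sketch that reproduces the failure and both ways of writing the fix. It assumes Python 3 syntax (print as a function), whereas the sessions in this thread are Python 2.6; the names broken, fixed_raw, and fixed_plain are made up for illustration.

```python
import json

# The literal from the question, typed WITHOUT the r prefix. Python consumes
# one level of backslashes, so the resulting text contains \d and \f.
broken = '{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'

# The same characters typed as a raw string: Python leaves the backslashes
# alone, so the JSON parser sees every backslash doubled.
fixed_raw = r'{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'

# Without the r prefix, each backslash has to be doubled once more for Python.
fixed_plain = '{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\\\\\host\\\\dir\\\\file.exe"}'

try:
    json.loads(broken)
    broken_error = None
except ValueError as exc:  # the decoder's error is a ValueError (or a subclass)
    broken_error = exc

path = json.loads(fixed_raw)["Path"]

print(broken[55:60])             # the slice inspected in the question
print(fixed_raw == fixed_plain)  # True: two spellings of the same text
print(path)                      # \\host\dir\file.exe
```

The decoded path holds the real UNC path, with two leading backslashes and single backslashes between components; any extra backslashes only appear when the value is shown through repr().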

    NOTE: A JSON string is a collection of zero or more Unicode characters, wrapped in double quotes, using backslash escapes (json.org). I've skipped encoding issues (namely, conversion between byte strings and unicode) in the above examples.
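
If there is any doubt about what correctly escaped JSON for this path should look like, one way to check is to let a serializer produce it. The sketch below uses json.dumps on a hypothetical payload dict that mirrors the question's data; it is written for Python 3 and says nothing about how the Prototype client builds its string.

```python
import json

# Hypothetical payload mirroring the question. The raw string holds the real
# UNC path: two leading backslashes, single backslashes between components.
payload = {"FileExists": True, "Version": "4.3.2.1", "Path": r"\\host\dir\file.exe"}

encoded = json.dumps(payload)  # the serializer doubles every backslash
decoded = json.loads(encoded)  # and the parser undoes the doubling

print(encoded)
print(decoded["Path"])         # \\host\dir\file.exe
```

The printed JSON text shows four backslashes before host and two before dir and file.exe, which is what the server has to receive for loads() to return the original path.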