Convert string to json where some values are wrapped in square brackets


What is the most efficient way in Python 3 to convert a string to a dict when keys and values may or may not be quoted, and nested objects are wrapped in square brackets ([]) instead of curly braces?

Values may also contain colons (:).

Example:

string = '[key:value, key2:[key2a:val2:a, key2b:[foo:"bar"]]]'

The result needs to be a valid dict, like:

{"key":"value", "key2":{"key2a":"val2:a", "key2b":{"foo":"bar"}}}


Solution

  • You can use a recursive generator function:

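    import re

    string = "[key:value, key2:[key2a:val2:a, key2b:[foo:'bar']]]"
    # Tokenize into brackets, colons, commas, quoted strings and bare words;
    # [1:-1] drops the outermost brackets, and quoted tokens lose their quotes.
    tokens = re.findall(r"[\[\]]|:|'.*?'|\w+|,", string)[1:-1]
    d = [t[1:-1] if t.startswith("'") else t for t in tokens]

    def to_dict(d):
        while (n := next(d, None)) not in {None, ']'}:
            next(d)  # skip the ':' after the key
            if (v := next(d)) == '[':
                v = dict(to_dict(d))  # nested object: recurse on the same iterator
                j = next(d, None)     # token after the nested object's ']'
            else:
                # Join tokens up to the next ',' or ']' so that
                # values containing ':' (e.g. val2:a) stay intact.
                c = [v]
                while (j := next(d, None)) not in {None, ',', ']'}:
                    c.append(j)
                v = ''.join(c)
            yield (n, v)
            if j != ',':
                return  # ']' or end of input closes this object

    print(dict(to_dict(iter(d))))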
    

    Output:

    {'key': 'value', 'key2': {'key2a': 'val2:a', 'key2b': {'foo': 'bar'}}}
    

    Edit: a solution without assignment expressions (the := walrus operator), for Python versions before 3.8:

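    def to_dict(d):
        n = next(d, None)
        while n not in {None, ']'}:
            next(d)  # skip the ':' after the key
            v = next(d)
            if v == '[':
                v = dict(to_dict(d))  # nested object: recurse on the same iterator
                j = next(d, None)     # token after the nested object's ']'
            else:
                # Join tokens up to the next ',' or ']' so that
                # values containing ':' (e.g. val2:a) stay intact.
                c, j = [v], next(d, None)
                while j not in {None, ',', ']'}:
                    c.append(j)
                    j = next(d, None)
                v = ''.join(c)
            yield (n, v)
            if j != ',':
                return  # ']' or end of input closes this object
            n = next(d, None)


    print(dict(to_dict(iter(d))))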
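
To see what the parser actually iterates over, the tokenizing step can be run on its own. The following is a minimal sketch, not part of the original answer: the names `TOKEN_RE` and `tokenize` are made up for illustration, and the sample input is an assumption. Since the question title asks for JSON, it also shows the resulting dict passed to the standard library's `json.dumps`.

```python
import json
import re

# Same pattern as in the answer, compiled once (TOKEN_RE is a hypothetical name).
TOKEN_RE = re.compile(r"[\[\]]|:|'.*?'|\w+|,")

def tokenize(text):
    """Return the flat list of tokens; characters that match nothing
    in the pattern (such as spaces) are silently skipped by findall."""
    return TOKEN_RE.findall(text)

tokens = tokenize("[key:value, key2:[foo:'bar baz']]")
print(tokens)
# ['[', 'key', ':', 'value', ',', 'key2', ':', '[', 'foo', ':', "'bar baz'", ']', ']']

# The dict produced by the answer serializes to JSON directly.
parsed = {'key': 'value', 'key2': {'key2a': 'val2:a', 'key2b': {'foo': 'bar'}}}
print(json.dumps(parsed))
# {"key": "value", "key2": {"key2a": "val2:a", "key2b": {"foo": "bar"}}}
```

Because `findall` drops anything the pattern does not match, an unquoted value containing spaces or hyphens would be split into separate `\w+` tokens and rejoined without the separator, so such values are best quoted in the input.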