python list parsing abstract-syntax-tree

Parse nested list from string that cannot be parsed with ast.literal_eval


I am parsing a file into a Python list and have encountered a nested list like this:

{   1   4{  2a  0.0 }{  3   0.0 }{  4c  0.0 }{  5   0.0 }   }

I want to interpret it as a list while keeping the nesting, so the resulting Python list should look like this:

[1,4,[2a,0.0],[3,0.0],[4c,0.0],[5,0.0]]

I managed to build a correctly bracketed string from it with the following:

l = """{    1   4{  2   0.0 }{  3   0.0 }{  4   0.0 }{  5   0.0 }   }"""
l = l.replace("{\t",",[").replace("\t}","]").replace("{","[").replace("}","]").replace("\t",",")[1:]

I can also apply l.split("\t") to get a list, but that only works for a flat input; a nested one gets flattened, which I do not want.

I tried ast.literal_eval(l), but it fails on unquoted tokens such as 2a, which are not valid Python literals.
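
The failure can be reproduced with a minimal sketch, assuming the converted string looks like the target list above (the variable names s and error are only illustrative):

```python
import ast

# Bracketed string in the shape of the desired list; 2a is not quoted.
s = "[1,4,[2a,0.0],[3,0.0]]"

error = None
try:
    ast.literal_eval(s)
except SyntaxError as exc:
    # 2a is neither a number nor a quoted string, so parsing is rejected.
    error = exc
    print("literal_eval rejected the string:", type(exc).__name__)
```

Quoting every token before calling ast.literal_eval would avoid the error, but then the replace chain has to tell tokens apart from brackets, which is what a small parser does more directly.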


Solution

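  • Pyparsing has a built-in helper, nestedExpr, for parsing nested lists between opening and closing delimiters (in pyparsing 3.x the snake_case names nested_expr, parse_string and as_list are also available). Tokens come back as strings, and the result is wrapped in one extra outer list: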

    >>> import pyparsing as pp
    >>> nested_braces = pp.nestedExpr('{', '}')
    >>> t = """{   1   4{  2a  0.0 }{  3   0.0 }{  4c  0.0 }{  5   0.0 }   }"""
    >>> print(nested_braces.parseString(t).asList())
    [['1', '4', ['2a', '0.0'], ['3', '0.0'], ['4c', '0.0'], ['5', '0.0']]]
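
If installing pyparsing is not an option, the same nesting can be recovered with the standard library alone. The following is a minimal sketch, not part of the original answer: the names parse_braces and convert are hypothetical, and it assumes tokens are separated by whitespace or braces and that numeric-looking tokens should become int or float while everything else (such as 2a) stays a string.

```python
import re

def convert(tok):
    """Return tok as int or float when possible, otherwise unchanged (assumed policy)."""
    for cast in (int, float):
        try:
            return cast(tok)
        except ValueError:
            pass
    return tok

def parse_braces(text):
    """Parse '{ a b { c d } }' into nested Python lists."""
    # A token is either a single brace or a run of non-space, non-brace characters.
    tokens = re.findall(r"[{}]|[^\s{}]+", text)
    root = []
    stack = [root]  # stack[-1] is the list currently being filled
    for tok in tokens:
        if tok == "{":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        else:
            stack[-1].append(convert(tok))
    if len(stack) != 1:
        raise ValueError("unbalanced '{'")
    return root

t = "{   1   4{  2a  0.0 }{  3   0.0 }{  4c  0.0 }{  5   0.0 }   }"
# Like the pyparsing result, the top level is wrapped in one outer list.
print(parse_braces(t)[0])
```

Taking element [0] strips the outer wrapper, leaving the list in the shape asked for in the question, with 2a and 4c kept as strings.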