Matching several lines with either keywords or plain text


import pyparsing as pp

pp.ParserElement.setDefaultWhitespaceChars(" \t")

# End of line. Clean it off.
NL = pp.LineEnd().suppress()

repeat_stmt = pp.Keyword("REPEAT") + pp.pyparsing_common.number()

end_stmt = pp.Keyword("END")

statement = (repeat_stmt | end_stmt)
text = pp.Group(~statement + pp.restOfLine)

structure = pp.ZeroOrMore(statement | text)
structure.ignore(NL)

DATA = """
line 1
line 2
a bit longer line 3
REPEAT 123
foo bar
END
"""

print(structure.parseString(DATA))

I'm trying to build a rather simple text generator with pyparsing.

Given the code above, I would expect non-keyword lines to be copied as is and keyword lines (currently just two kinds) to be parsed.

Instead, the code ends up in an infinite loop.

I would expect the output to be something along the lines of:

[
 ['line 1'], ['line 2'], ['a bit longer line 3'],
 [['REPEAT', 123]], ['foo bar'], [['END']],
]

How can I achieve that?


Solution

  • If you enable debugging on NL parsing, using setDebug,

    NL = pp.LineEnd().suppress().setDebug()
    

    you'll see that NL loops forever at the end of the input string. You can break this loop with the stopOn argument of ZeroOrMore:

    structure = pp.ZeroOrMore(statement | text, stopOn=pp.StringEnd())
    

    With this change, you'll get:

    [['line 1'], ['line 2'], ['a bit longer line 3'], 'REPEAT', 123, ['foo bar'], 'END', ['']]
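
The result above still has the statements ungrouped and a trailing [''], which appears to come from restOfLine matching an empty string at the end of the input. A minimal sketch of one way to get closer to the expected output follows. It is a variation on the code above, not the only possible fix: the Group wrappers around the statements and the Regex used for text lines are assumptions added here, and newlines are consumed explicitly rather than through ignore().

```python
import pyparsing as pp

# Keep newlines significant: only spaces and tabs are skipped as whitespace.
pp.ParserElement.setDefaultWhitespaceChars(" \t")

# End of line, dropped from the results.
NL = pp.LineEnd().suppress()

# Group each statement so its tokens stay together in the output.
repeat_stmt = pp.Group(pp.Keyword("REPEAT") + pp.pyparsing_common.number())
end_stmt = pp.Group(pp.Keyword("END"))
statement = repeat_stmt | end_stmt

# A text line needs at least one character, so it cannot match an empty
# string the way restOfLine can. That removes the zero-length match that
# kept ZeroOrMore going at the end of the input.
text = pp.Group(~statement + pp.Regex(r"[^\n]+"))

# Consume line ends as part of the grammar instead of ignoring them.
structure = pp.ZeroOrMore(NL | statement | text)

DATA = """
line 1
line 2
a bit longer line 3
REPEAT 123
foo bar
END
"""

result = structure.parseString(DATA).asList()
print(result)
```

Each statement comes back as a single group (for example ['REPEAT', 123]) rather than the doubly nested form in the question. Wrapping statement in another pp.Group would add that extra level if it is really wanted.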