
PyParsing, matching lines beginning with White


Given the following newline-sensitive grammar, how can I ignore comment lines that begin with spaces?

The expression pp.LineStart() + pp.Optional(pp.White(" \t")) + '#' does not match lines beginning with spaces, contrary to what one would expect.

import pyparsing as pp
pp.ParserElement.setDefaultWhitespaceChars(' \t')

def Line(expr): return expr + pp.Suppress(pp.LineEnd())

foo = Line(pp.Group(pp.OneOrMore(pp.Word(pp.alphas))))

parser = pp.OneOrMore(foo)

comment = '#' + pp.restOfLine()
parser.ignore(pp.LineStart() + pp.Optional(pp.White(" \t")) + pp.Optional(comment) + pp.LineEnd())
parser.ignore(comment)

text = """

 foo abc
# comment
bar # comment
"""

results = parser.parseString(text, parseAll=True)
assert list(results[0]) == ['foo', 'abc']

text = """

foo abc
 # comment
bar
"""

results = parser.parseString(text, parseAll=True)

print("ok")

Solution

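  • Once a comment is ignored, a line containing only that comment leaves a dangling LineEnd in the input, and foo cannot match it because it requires at least one word before the end of line. Change parser so that it also accepts a bare LineEnd: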

    parser = pp.OneOrMore(foo | pp.LineEnd().suppress())
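To see the change in context, here is a minimal sketch of the question's grammar with the adjusted parser, run against the second (previously failing) input. One simplification is an assumption of this sketch and not part of the answer: the LineStart-based ignore rule is dropped and only parser.ignore(comment) is kept, on the reasoning that leading spaces are already skipped as default whitespace and the bare LineEnd alternative consumes whatever end of line remains. The name rows is introduced here for illustration only.

```python
import pyparsing as pp

# Only spaces and tabs are skippable whitespace, so newlines stay significant.
pp.ParserElement.setDefaultWhitespaceChars(' \t')

def Line(expr):
    """Match expr followed by an end of line, dropping the newline token."""
    return expr + pp.Suppress(pp.LineEnd())

foo = Line(pp.Group(pp.OneOrMore(pp.Word(pp.alphas))))

# Accept either a content line or a bare end of line
# (a blank line, or what is left of a comment-only line).
parser = pp.OneOrMore(foo | pp.LineEnd().suppress())

# Assumption of this sketch: a single ignore rule, without the
# LineStart-based rule from the question.
comment = '#' + pp.restOfLine
parser.ignore(comment)

text = """

foo abc
 # comment
bar
"""

results = parser.parseString(text, parseAll=True)

# Convert each Group into a plain list for easy inspection.
rows = [list(group) for group in results]
print(rows)
```

With this arrangement the indented comment line should contribute no group of its own, leaving one group per content line. Whether the LineStart-based rule is still needed for other inputs depends on the pyparsing version in use, so it is worth re-running the question's first input as well.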