pyparsing not parsing the whole string

I have the following grammar and test case:

from pyparsing import Word, nums, Forward, Suppress, OneOrMore, Group

#A grammar for a simple class of regular expressions
number = Word(nums)('number')
lparen = Suppress('(')
rparen = Suppress(')')

expression = Forward()('expression')

concatenation = Group(expression + expression)

disjunction = Group(lparen + OneOrMore(expression + Suppress('|')) + expression + rparen)

kleene = Group(lparen + expression + rparen + '*')

expression << (number | disjunction | kleene | concatenation)

#Test a simple input
tests = """

for t in tests:
    print t
    print expression.parseString(t)

The result should be

[['8', '*'],[['3', '2'], '2']]

but instead, I only get

[['8', '*']]

How do I get pyparsing to parse the whole string?


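  • Your concatenation expression is not doing what you want. The | operator takes the first alternative that matches, so expression returns as soon as number, disjunction, or kleene succeeds and never continues on to the rest of the string. concatenation is only tried when all three fail, at which point it would recurse into expression without consuming any input, so it comes close to being left-recursive (fortunately it is the last term in your expression). Your grammar works if you instead do: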

    expression << OneOrMore(number | disjunction | kleene)

    With this change, I get this result:

    [['8', '*'], [['3', '2'], '2']]

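    EDIT: You can also avoid the precedence problem of << binding more tightly than | (which is why the right-hand side needed parentheses in your original definition) if you use the <<= operator instead: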

    expression <<= OneOrMore(number | disjunction | kleene)
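
For reference, here is a minimal, self-contained sketch of the grammar with that change applied. The test string in the question is cut off, so the input `(8)*((3|2)|2)` below is an assumption chosen to be consistent with the expected result shown above; the helper `parses_fully` is a hypothetical name added for illustration. Passing `parseAll=True` is optional, but it makes pyparsing raise a `ParseException` instead of silently returning a partial match, which is a quick way to catch this kind of problem.

```python
from pyparsing import (Forward, Group, OneOrMore, ParseException,
                       Suppress, Word, nums)

# Same building blocks as in the question (results names omitted for brevity)
number = Word(nums)
lparen = Suppress('(')
rparen = Suppress(')')

expression = Forward()

disjunction = Group(lparen + OneOrMore(expression + Suppress('|')) + expression + rparen)
kleene = Group(lparen + expression + rparen + '*')

# Concatenation is expressed as repetition rather than as a separate
# self-referencing rule; <<= avoids the need for extra parentheses.
expression <<= OneOrMore(number | disjunction | kleene)

# Assumed input: the original test string is not shown in the question.
sample = "(8)*((3|2)|2)"

# parseAll=True raises ParseException if any input is left unparsed.
result = expression.parseString(sample, parseAll=True).asList()
print(result)  # [['8', '*'], [['3', '2'], '2']]


def parses_fully(text):
    """Hypothetical helper: True if the grammar consumes the entire string."""
    try:
        expression.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False
```

With this grammar, `parses_fully("(8)*12")` should be true, while a string with trailing unparseable text such as `"(8)*)"` should be rejected rather than partially matched.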