How to parse this string with PyParsing?

I want to parse:

'APPLE BANANA FOO TEST BAR'

into:

[['APPLE BANANA'], 'FOO', ['TEST BAR']]

Here is my latest attempt:

from pyparsing import Group, Keyword, Word, ZeroOrMore, alphas

to_parse = 'APPLE BANANA FOO TEST BAR'
words = Word(alphas)
foo = Keyword("FOO")
parser = Group(ZeroOrMore(words + ~foo)) + foo + Group(ZeroOrMore(words))
result = parser.parseString(to_parse)

But it raises the following error:

>       raise ParseException(instring, loc, self.errmsg, self)
E       pyparsing.ParseException: Expected "FOO" (at char 6), (line:1, col:7)

I think the problem comes from ZeroOrMore(words + ~foo), which is "too greedy". According to a few questions on SO, the solution is to use negation with ~foo, but it doesn't work in this case. Any help would be appreciated.

Solution

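You are definitely on the right track. You just need to do the negative lookahead for foo before parsing each word, not after it. With words + ~foo, the first repetition matches APPLE and sees that BANANA is not FOO, but the second repetition matches BANANA and then fails the lookahead because FOO comes next. ZeroOrMore therefore stops after APPLE, and the parser expects FOO at char 6, where BANANA actually is. Putting ~foo first fixes this: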

parser = Group(ZeroOrMore(~foo + words)) + foo + Group(ZeroOrMore(words))
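To see the difference between the two orderings side by side, here is a minimal sketch that runs both on the string from the question. The helper build and the variable names are only for illustration; everything else is taken from the code in this thread.

```python
from pyparsing import Group, Keyword, ParseException, Word, ZeroOrMore, alphas

to_parse = "APPLE BANANA FOO TEST BAR"
words = Word(alphas)
foo = Keyword("FOO")


def build(leading):
    # Hypothetical helper: wrap a "leading words" expression in the full grammar.
    return Group(leading) + foo + Group(ZeroOrMore(words))


# Lookahead after each word (the original attempt).
lookahead_after = build(ZeroOrMore(words + ~foo))
# Lookahead before each word (the fix).
lookahead_before = build(ZeroOrMore(~foo + words))

try:
    lookahead_after.parseString(to_parse)
    after_failed = False
except ParseException:
    after_failed = True

fixed_result = lookahead_before.parseString(to_parse).asList()

print(after_failed)   # True: the original ordering raises ParseException
print(fixed_result)   # [['APPLE', 'BANANA'], 'FOO', ['TEST', 'BAR']]
```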

In recent pyparsing releases, I added a stopOn argument to ZeroOrMore and OneOrMore that does the same thing, to make this less error-prone:

parser = Group(ZeroOrMore(words, stopOn=foo)) + foo + Group(ZeroOrMore(words))

With this change I get:

>>> result.asList()
[['APPLE', 'BANANA'], 'FOO', ['TEST', 'BAR']]
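Note that this output keeps each word as a separate token, whereas the question asked for 'APPLE BANANA' as a single string. One possible way to get that, assuming it is acceptable to join the matched words with a single space, is to attach a parse action to each repetition. The function join_words below is a made-up name for illustration, not part of pyparsing.

```python
from pyparsing import Group, Keyword, Word, ZeroOrMore, alphas

words = Word(alphas)
foo = Keyword("FOO")


def join_words(tokens):
    # Collapse the matched words into one space-separated string.
    return " ".join(tokens)


# Join the words on each side of FOO before grouping them.
before_foo = ZeroOrMore(~foo + words).setParseAction(join_words)
after_foo = ZeroOrMore(words).setParseAction(join_words)
joined_parser = Group(before_foo) + foo + Group(after_foo)

joined_result = joined_parser.parseString("APPLE BANANA FOO TEST BAR").asList()
print(joined_result)  # [['APPLE BANANA'], 'FOO', ['TEST BAR']]
```

Joining with a single space does not preserve the original spacing of the input, so this is only equivalent to the requested output when the words are separated by single spaces.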