pyparsing: Unable to get results from ParseResults object

>>> from pyparsing import Word, alphanums, OneOrMore, Optional, Suppress

>>> var = Word(alphanums)
>>> reg = OneOrMore(var('predictors') + Optional(Suppress('+'))) + '~' + OneOrMore(var('covariates') + Optional(Suppress('+')))

>>> string = 'y1 ~ f1 + f2 + f3'
>>> reg.parseString(string)
(['y1', '~', 'f1', 'f2', 'f3'], {'predictors': ['y1'], 'covariates': ['f1', 'f2', 'f3']})

The expression parses the string correctly, but I cannot retrieve all of the values for predictors and covariates. Only the last value seems to be stored:

>>> results = reg.parseString(string)
>>> results.covariates
'f3'
>>> results['covariates']
'f3'

I would like to get all the values in predictors and covariates as lists. Any ideas why this is happening?

Solution

By default, results names behave much like keys in a Python dict: if several values are assigned to the same key, only the last one assigned is kept.

However, this behavior can be overridden, depending on how the parser defines the results names.

If using the full expr.setResultsName("XYZ") form, add the listAllMatches=True argument. This tells pyparsing to keep every value parsed under that name and return them as a list.

If using the shortcut expr("XYZ") form, add a '*' to the end of the name: expr("XYZ*"). This is equivalent to setting listAllMatches to True.

The trailing '*' is handled by setResultsName to support the short form: expr("name*") is equivalent to expr.setResultsName("name", listAllMatches=True). If you prefer to call setResultsName directly, pass the listAllMatches argument rather than using the '*' notation.
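
Applied to the grammar from the question, both forms might look like the sketch below. It assumes pyparsing is installed and keeps the camelCase method names used in the question; the names plus, reg_short and reg_long are introduced here only for illustration.

```python
from pyparsing import Word, alphanums, OneOrMore, Optional, Suppress

var = Word(alphanums)
plus = Optional(Suppress('+'))  # illustrative helper, not in the original grammar

# Short form: a trailing '*' on the results name keeps every match.
reg_short = (
    OneOrMore(var('predictors*') + plus)
    + '~'
    + OneOrMore(var('covariates*') + plus)
)

# Long form: the same request made through setResultsName.
reg_long = (
    OneOrMore(var.setResultsName('predictors', listAllMatches=True) + plus)
    + '~'
    + OneOrMore(var.setResultsName('covariates', listAllMatches=True) + plus)
)

string = 'y1 ~ f1 + f2 + f3'
short_results = reg_short.parseString(string)
long_results = reg_long.parseString(string)

# Named results now hold all matches; list() converts them to plain lists.
predictors = list(short_results['predictors'])
covariates = list(short_results['covariates'])
print(predictors, covariates)
```

Wrapping the named results in list() gives plain Python lists, and both grammars should produce the same values for each name.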