Is this technique acceptable for constructing ParseResults in pyparsing?


I feel that ParseActions make my code a bit clunky when trying to construct the resulting parse tree (known as ParseResults in pyparsing).

Right now I keep global variables that store the groups of matched tokens returned by the Group element, and at the end I inject those results back into the toks dictionary. Is this OK?

My sketchy grammar:

grammar = ZeroOrMore( Or( [ExprA, ExprB, ExprC] ) )

Note that ExprA, ExprB, etc. can interleave in any order, but I want to group all expressions of one type into a single dictionary entry in ParseResults. What do you think of my technique? I don't like using global variables because they make multithreading a problem. Do I have other choices?


Solution

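  • Have you thought about using setResultsName with listAllMatches=True? With that flag, every match of a named expression is accumulated under its name instead of only the last one being kept. Here's a demo: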

    from pyparsing import Word, ZeroOrMore, nums

    # Each expression matches its letter followed by digits, e.g. "A1"
    aExpr = Word("A", nums)
    bExpr = Word("B", nums)
    cExpr = Word("C", nums)

    # listAllMatches=True collects every match under its results name
    grammar = ZeroOrMore(aExpr.setResultsName("A", listAllMatches=True) |
                         bExpr.setResultsName("B", listAllMatches=True) |
                         cExpr.setResultsName("C", listAllMatches=True))

    results = grammar.parseString("A1 B1 A2 C1 B2 A3")
    print(results.dump())
    

    prints:

    ['A1', 'B1', 'A2', 'C1', 'B2', 'A3']
    - A: ['A1', 'A2', 'A3']
    - B: ['B1', 'B2']
    - C: ['C1']
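
To show how the grouped matches can be read back out once they are stored under results names, here is a minimal sketch. The input string and the variable names (a_tokens, b_tokens, c_tokens, has_c) are made up for illustration; the input deliberately contains no C tokens so the "name never matched" case is visible.

```python
from pyparsing import Word, ZeroOrMore, nums

grammar = ZeroOrMore(
    Word("A", nums).setResultsName("A", listAllMatches=True)
    | Word("B", nums).setResultsName("B", listAllMatches=True)
    | Word("C", nums).setResultsName("C", listAllMatches=True)
)

# Illustrative input: A and B tokens interleaved, no C tokens at all
results = grammar.parseString("A1 B1 A2 B2 A3")

a_tokens = list(results["A"])          # index by results name, like a dict
b_tokens = list(results.B)             # attribute-style access also works
c_tokens = list(results.get("C", []))  # "C" never matched, so fall back to []
has_c = "C" in results                 # membership test on results names
```

Using get() with a default (or the "in" test) avoids a KeyError when one of the expression types does not appear in the input at all.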
    

    EDIT:

    The newer form for this would be the following, where a trailing "*" in the results name is shorthand for listAllMatches=True:

    grammar = ZeroOrMore(aExpr("A*") | bExpr("B*") | cExpr("C*") )
    

    I found ".setResultsName" to be too verbose and cluttering when defining grammars, which worked against my intention of encouraging people to use results names.
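
As for the global variables in the question, one way to avoid them is to let each parse call return its own grouped results. The sketch below uses the short "A*" form; the names GRAMMAR and group_by_type are hypothetical, chosen only for this example. It addresses shared state in the calling code only and makes no claim about pyparsing's own thread-safety.

```python
from pyparsing import Word, ZeroOrMore, nums

# Built once; the grouped matches live in each ParseResults, not in the grammar.
# GRAMMAR and group_by_type are illustrative names, not part of pyparsing.
GRAMMAR = ZeroOrMore(
    Word("A", nums)("A*") | Word("B", nums)("B*") | Word("C", nums)("C*")
)


def group_by_type(text):
    """Parse text and return {results name: [matched tokens]} for this call."""
    results = GRAMMAR.parseString(text)
    # Only names that matched at least once appear in results.keys()
    return {name: list(results[name]) for name in results.keys()}


first = group_by_type("A1 B1 A2 C1 B2 A3")
second = group_by_type("C7 C8")
```

Each call builds a fresh dictionary from its own ParseResults, so nothing from one parse leaks into the next and no module-level accumulator is needed.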