Tags: python, structure, pyparsing

Can I overwrite operator_rules of infix_notation and keep pyparsing code dry when using forward-defined elements?


In shortened and simplified pseudocode I have something like this:

arg = Forward()
...
...
func_call = somestuff + arg

term = ... | ... | arg

expression = infix_notation(term, operator_rules)
...
...
arg <<= expression | func_call | ... | ...
...
statement = var + ASSIGN + expression
code_line = Optional(White()) + (statement | func_call | expression) + Optional(dbl_slash_comment).leaveWhitespace()

The issue I don't know how to solve is that I need almost the same code twice, but with different parse actions inside operator_rules. If expression were the last thing defined here (with nothing else depending on it through a Forward() component), I could keep everything else in one module, import term into two other modules, and build a separate infix_notation expression in each, with whatever parse actions I need inside operator_rules.

My current solution is just to create 2 identical modules with this code, then modify the parse_action functions inside one of them. But it feels like a bad practice. If I ever need to change something in this parser, I'll need to change it in both modules.

I feel like I don't know something here about how pyparsing works.

I know that I can import a ParserElement and replace its parse action with set_parse_action(), but I don't know how to do that for an infix_notation element. Another issue is that the Forward-defined arg depends on that infix_notation element. So even if I could import the expression and rewrite the parse actions inside the infix_notation, that wouldn't update the elements defined after the expression, or the forward-defined arg, right?

I tried putting the whole parser inside a function and building the infix_notation with different operator rules, chosen by if statements that look at the input. It works, but it seems bulky and still doesn't solve the problem of importing elements that are tied to the forward-defined term. In fact, now that the whole thing is inside a function, I can't even import the final code_line ParserElement for use in another module where I define conditional structures.

So I think the problem is 2-fold:

  1. Is there any way to overwrite the operator_rules of an infix_notation element, both the parse actions and the list of operator rules itself? (Part of the reason I need this: execution time seems to grow exponentially as the number of nested expressions grows, so I limit the operators I define to the ones that actually appear in the string I'm parsing.)
  2. The forward-defined elements that are used as the term in the infix_notation seem to prevent me from importing the infix_notation element to rewrite its operator_rules, even if that were possible.

I'm not sure there even is an elegant solution here that'd keep the code DRY, but any insight/advice is very appreciated!


Solution

  • Instead of modifying the parse actions inside the infix notation expression, try wrapping the infix_notation part of your code in a function that takes an argument containing the functions.

    Here is a function that builds an infix notation parser using the static methods of a class passed in as an argument:

    import pyparsing as pp
    
    def make_custom_infix_notation(fn_object):
        operand = pp.common.number()
    
        expr = pp.infix_notation(
            operand,
            [
                (pp.one_of("* /"), 2, pp.OpAssoc.LEFT, fn_object.mul),
                (pp.one_of("+ -"), 2, pp.OpAssoc.LEFT, fn_object.add),
            ]
        )
        return expr
    

    Here is a class that converts each binary operation into a function call taking 2 arguments:

    class MakeIntoFunctions:
        @staticmethod
        def mul(tokens):
            tokens = tokens[0]
            cur_op = str(tokens[0])
            for operator, operand in zip(tokens[1::2], tokens[2::2]):
                function = {'*': 'MUL', '/': 'DIV'}[operator]
                cur_op = f"{function}({cur_op},{operand})"
            return cur_op
    
        @staticmethod
        def add(tokens):
            tokens = tokens[0]
            cur_op = str(tokens[0])
            for operator, operand in zip(tokens[1::2], tokens[2::2]):
                function = {'+': 'ADD', '-': 'SUB'}[operator]
                cur_op = f"{function}({cur_op},{operand})"
            return cur_op
    

    And here is an example using this function:

    parser = make_custom_infix_notation(MakeIntoFunctions)
    print(parser.parse_string("1+2/3+11/(12+7)"))
    

    which prints:

    ['ADD(ADD(1,DIV(2,3)),DIV(11,ADD(12,7)))']
    

    You could write another class that converts to postfix, for example, and then just pass that class in instead.
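As a rough illustration of that idea, here is a minimal sketch of a postfix-emitting class. The name MakeIntoPostfix and the helper _to_postfix are hypothetical, not part of pyparsing or the code above; the factory function is repeated only so the snippet runs on its own.

```python
import pyparsing as pp


def make_custom_infix_notation(fn_object):
    # Same factory as in the answer, repeated so this snippet is self-contained.
    operand = pp.common.number()
    return pp.infix_notation(
        operand,
        [
            (pp.one_of("* /"), 2, pp.OpAssoc.LEFT, fn_object.mul),
            (pp.one_of("+ -"), 2, pp.OpAssoc.LEFT, fn_object.add),
        ],
    )


def _to_postfix(tokens):
    # Hypothetical helper: fold "a op b op c" left to right into "a b op c op".
    tokens = tokens[0]
    cur_op = str(tokens[0])
    for operator, operand in zip(tokens[1::2], tokens[2::2]):
        cur_op = f"{cur_op} {operand} {operator}"
    return cur_op


class MakeIntoPostfix:
    # Both precedence levels can share one action, because the operator
    # symbol itself is written into the output.
    mul = staticmethod(_to_postfix)
    add = staticmethod(_to_postfix)


postfix_parser = make_custom_infix_notation(MakeIntoPostfix)
print(postfix_parser.parse_string("1+2/3+11/(12+7)"))
```

With the same input as before, this should print ['1 2 3 / + 11 12 7 + / +']. The grammar itself is untouched; only the object passed to the factory changes.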

    EDIT: The object passed in doesn't have to be a class; it could just as easily be a namedtuple or typing.NamedTuple with parse actions as members. It could even be a plain dict (though then you would need to change the references to look up by key string instead of by attribute name).
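The same approach can probably be stretched to cover the Forward-based layout from the question. The sketch below is only an assumption about how that might look: InfixActions, Grammar, fold_left and make_grammar are hypothetical names, and the tiny grammar (numbers, identifiers, single-argument function calls) stands in for the real one. The factory builds every element, including the Forward, around a NamedTuple of parse actions and returns the pieces a caller may want.

```python
from typing import Callable, NamedTuple

import pyparsing as pp


class InfixActions(NamedTuple):
    """Hypothetical container for the parse actions used in the operator rules."""
    mul: Callable
    add: Callable


class Grammar(NamedTuple):
    """Hypothetical bundle of the parser elements other modules may need."""
    expression: pp.ParserElement
    func_call: pp.ParserElement
    statement: pp.ParserElement


def fold_left(names):
    """Return a parse action folding 'a op b op c' into nested calls."""
    def action(tokens):
        tokens = tokens[0]
        result = str(tokens[0])
        for operator, operand in zip(tokens[1::2], tokens[2::2]):
            result = f"{names[operator]}({result},{operand})"
        return result
    return action


def make_grammar(actions):
    """Build the whole grammar, Forward included, around the given actions."""
    LPAR, RPAR = map(pp.Suppress, "()")
    ident = pp.Word(pp.alphas + "_", pp.alphanums + "_")

    arg = pp.Forward()
    func_call = (ident + LPAR + arg + RPAR).set_parse_action(
        lambda t: f"{t[0]}({t[1]})"  # collapse the call into a single token
    )
    term = pp.common.number() | func_call | ident

    expression = pp.infix_notation(
        term,
        [
            (pp.one_of("* /"), 2, pp.OpAssoc.LEFT, actions.mul),
            (pp.one_of("+ -"), 2, pp.OpAssoc.LEFT, actions.add),
        ],
    )
    arg <<= expression  # the Forward is resolved inside the same factory call
    statement = ident + pp.Suppress("=") + expression
    return Grammar(expression, func_call, statement)


as_functions = InfixActions(
    mul=fold_left({"*": "MUL", "/": "DIV"}),
    add=fold_left({"+": "ADD", "-": "SUB"}),
)
grammar = make_grammar(as_functions)
print(grammar.statement.parse_string("y = f(1+2)*3"))
```

This should print ['y', 'MUL(f(ADD(1,2)),3)']. Because the factory is called at module level, another module can import grammar.statement (or whichever element it needs) directly, and a second grammar with different parse actions is just another make_grammar(...) call rather than a copied module.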