Search code examples
pythonregexstringparsingstack

How to convert a string mixed with infix and prefix sub-expressions to all prefix sub-expressions?


Consider I have a string formula which is written in this format: "func(a+b,c)", where func is a custom function, this string contains both infix(i.e. the +) and prefix(i.e. the func) representations, I'd like to convert it to a string with all prefix representations, "func(+(a,b), c)", how can I do that?

Another example is "func1(a*(b+c),d)-func2(e,f)" to "-(func1(*(a, +(b,c)), d), func2(e,f))"

More background: I want to build a parser, where the user would input the expressions in string format just as described above, normally a custom function would be written as a prefix expression, i.e. "func(a,b)", but due to the common convention, people would still write the +-*/ in an infix expression, i.e. "a+b".

If an expression is given in all infix format, I can easily convert it to a tree object I predefined, but if an expression is mixed with both prefix and infix formats, I have no idea how to convert it to a tree object, thus asking how to convert the string to all infix formats.

I have no experience with parsers, appreciate any initial guidance.


Solution

  • If the language you are parsing is that similar to Python, you can just use the Python parser as provided by the built-in ast module, and implement a visitor over the nodes that interest you, in order to build up the prefix expression. For example, you could try this:

    import ast
    
    def printc(*args): print(*args, end='')
    
    class PrefixPrinter(ast.NodeVisitor):
        def visit_BinOp(self, node):
            if type(node.op) == ast.Add:
                printc("+")
            if type(node.op) == ast.Sub:
                printc("-")
            if type(node.op) == ast.Mult:
                printc("*")
            if type(node.op) == ast.Div:
                printc("/")
            printc("(")
            self.visit(node.left)
            printc(",")
            self.visit(node.right)
            printc(")")
    
        def visit_Call(self, node):
            self.visit(node.func)
            printc("(")
            for index, arg in enumerate(node.args):
                self.visit(arg)
                if index < len(node.args) - 1:
                    printc(", ")
            printc(")")
    
        def visit_Name(self, node):
            printc(node.id)
    
    first_example = "func(a+b,c)"
    second_example = "func1(a*(b+c),d)-func2(e,f)"
    printc(f"{first_example=} converts to ")
    PrefixPrinter().visit(ast.parse(first_example))
    print()
    printc(f"{second_example=} converts to ")
    PrefixPrinter().visit(ast.parse(second_example))
    print()
    

    Execution:

    $ python3 to_prefix.py 
    first_example='func(a+b,c)' converts to func(+(a,b), c)
    second_example='func1(a*(b+c),d)-func2(e,f)' converts to -(func1(*(a,+(b,c)), d),func2(e, f))