I am using pyparsing to parse some text. I created a grammar and it works as expected. However, for an expression like this one:
OR(OR(in1, in2), in3)
I want to replace each nested expression with an "alias" and then create an expression for that alias. In simple words:
# I have this expression ( OR(OR(in1, in2), in3) )
# Which I parsed to
parsed = ["OR", [["OR", ["in1", "in2"]], "in3"]]
# I want to have
exp1 = ["OR", ["in1", "in2"]]
exp2 = ["OR", ["exp1", "in3"]]
This is a minimal example; in general I can have arbitrarily nested "expressions" (each with only two arguments). Is there a way to do this?
Here is a parser that is probably similar to the one you wrote:
import pyparsing as pp
LPAR, RPAR = map(pp.Suppress, "()")
OR = pp.Keyword("OR")
term = pp.pyparsing_common.identifier
or_expr = pp.Forward()
or_expr <<= pp.Group(OR + pp.Group(LPAR + pp.delimitedList(or_expr | term)) + RPAR)
When it parses the string you gave, it provides the same nested output.
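As a quick check of that claim, here is a minimal sketch that runs the grammar on the string from the question. It assumes pyparsing is installed; the names `sample` and `result` are only illustrative and do not come from the original post.

```python
import pyparsing as pp

# Same grammar as above, repeated so this snippet runs on its own.
LPAR, RPAR = map(pp.Suppress, "()")
OR = pp.Keyword("OR")
term = pp.pyparsing_common.identifier
or_expr = pp.Forward()
or_expr <<= pp.Group(OR + pp.Group(LPAR + pp.delimitedList(or_expr | term)) + RPAR)

# `sample` is an illustrative name for the input string from the question.
sample = "OR(OR(in1, in2), in3)"
result = or_expr.parseString(sample)

# The outer Group wraps everything, so the nested structure is the first item.
print(result.asList()[0])
```

Because the whole expression is wrapped in a Group, the nested list from the question is the first (and only) element of the parse results.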
To create the "expN" expression names, you can use a parse action to gather each expression, along with its generated name, in a global list variable:
# add parse action to convert OR's to exprs
exprs = []
def generate_expr_definition(tokens):
    expr_name = "exp{}".format(len(exprs)+1)
    exprs.append((expr_name, tokens.asList()[0]))
    return expr_name
or_expr.addParseAction(generate_expr_definition)
When you run this parser, the returned results aren't the important part. What is important is the exprs list that was built while parsing:
sample = "OR(OR(in1, in2), in3)"
or_expr.parseString(sample)
# generate assignments for each nested OR expr
for name, expr in exprs:
    print("{} = {}".format(name, expr))
This gives:
exp1 = ['OR', ['in1', 'in2']]
exp2 = ['OR', ['exp1', 'in3']]
Now I look at that and ask, "How will I know the difference between the 'exp1' that was parsed from the input and the 'exp1' that is supposed to represent a parsed expression?" If this is to be interpreted as a Python assignment, it should really read:
exp2 = ['OR', [exp1, 'in3']]
with no quotes around the variable name.
To do this, we need to return an object from the parse action whose repr is the name without the surrounding quotes. Like this:
class ExprName:
    def __init__(self, name):
        self._name = name
    def __repr__(self):
        return self._name
Change the return statement in the parse action to:
return ExprName(expr_name)
And the resulting output now looks like:
exp1 = ['OR', ['in1', 'in2']]
exp2 = ['OR', [exp1, 'in3']]
Now you can distinguish the generated expN vars from parsed inputs.
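If you already have the nested list and would rather not attach a parse action, the same aliasing can be sketched as a plain recursive walk over the parsed structure. This is an alternative to the parse-action approach above, not part of it: the helper name `flatten` is hypothetical, and it assumes every node has the shape [operator, [arguments]] shown in the question.

```python
class ExprName:
    """Marker for a generated alias; its repr is the bare name, without quotes."""
    def __init__(self, name):
        self._name = name

    def __repr__(self):
        return self._name


def flatten(node, exprs):
    """Hypothetical helper: replace a nested [op, [args]] node with an alias.

    Plain strings (parsed inputs) are returned unchanged. Each nested
    expression is recorded in `exprs` as a (name, definition) pair,
    innermost first, and replaced by an ExprName.
    """
    if isinstance(node, str):
        return node
    op, args = node
    new_args = [flatten(arg, exprs) for arg in args]
    name = "exp{}".format(len(exprs) + 1)
    exprs.append((name, [op, new_args]))
    return ExprName(name)


# The parsed structure from the question.
parsed = ["OR", [["OR", ["in1", "in2"]], "in3"]]
exprs = []
flatten(parsed, exprs)

for name, expr in exprs:
    print("{} = {}".format(name, expr))
```

Arguments are flattened before their parent is named, so inner expressions get lower numbers and every alias is defined before it is used.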