I am trying to parse expressions with pyparsing and can do that with infix_notation, but the problem is that it also matches lines that contain no operations and consist only of the base_expr argument. This is a problem because valid keywords can be matched by the base_expr.
I use this as the infix_notation:
expression = infix_notation(
    Word(
        printables,
        exclude_chars="** ~ + - * / % & | ^ != == <= >= < > ! , += -= *= /= %= <<= >>= &= |= ^="
    ),
    [
        ("**", 2, OpAssoc.LEFT),
        (one_of("~ + -"), 1, OpAssoc.RIGHT),
        (one_of("* / % *= /= %="), 2, OpAssoc.LEFT),
        (one_of("<< >> <<= >>="), 2, OpAssoc.LEFT),
        (one_of("& | ^ &= |= ^="), 2, OpAssoc.LEFT),
        (one_of("+ - += -="), 2, OpAssoc.LEFT),
        (one_of("!= == <= >= < >"), 2, OpAssoc.LEFT),
        (one_of("&& ||"), 2, OpAssoc.LEFT),
        ("!", 1, OpAssoc.RIGHT),
    ])
The problematic part is the base expression:
Word(
    printables,
    exclude_chars="** ~ + - * / % & | ^ != == <= >= < > ! , += -= *= /= %= <<= >>= &= |= ^="
)
This matches the keyword "else", which I do not want, but it still needs to match variables in an expression like "else1 += else2".
How would you do this?
A common way to differentiate keywords from identifiers is to define an expression that matches any keyword, then use it as a negative lookahead in front of the operand. This example uses the list of all Python keywords, but you can define your own list:
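from keyword import kwlist
from pyparsing import Word, infix_notation, one_of, printables

# matches any keyword as a whole word, so "else" matches but "else1" does not
any_keyword = one_of(kwlist, as_keyword=True)
infix_term = Word(
    printables,
    exclude_chars="** ~ + - * / % & | ^ != == <= >= < > ! , += -= *= /= %= <<= >>= &= |= ^="
)
# negative lookahead: reject the operand if a keyword starts here
operand = ~any_keyword + infix_term
expression = infix_notation(operand,
    ... etc. ...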
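To check that the lookahead behaves as described, here is a minimal runnable sketch. The shortened operator table, the compact exclude_chars string, and the helper name parses are assumptions made for illustration; they are not taken from the question's grammar.

```python
from keyword import kwlist

import pyparsing as pp

# Matches any Python keyword as a whole word ("else" yes, "else1" no).
any_keyword = pp.one_of(kwlist, as_keyword=True)

# Same idea as the question's term, with the excluded characters written
# once each (an assumed simplification).
infix_term = pp.Word(pp.printables, exclude_chars="~+-*/%&|^!=<>,")

# Negative lookahead: the term is rejected if a keyword starts here.
operand = ~any_keyword + infix_term

# Shortened operator table, for illustration only.
expression = pp.infix_notation(
    operand,
    [
        (pp.one_of("* / %"), 2, pp.OpAssoc.LEFT),
        (pp.one_of("+ - += -="), 2, pp.OpAssoc.LEFT),
    ],
)


def parses(text):
    """Hypothetical helper: True if the whole string parses as an expression."""
    try:
        expression.parse_string(text, parse_all=True)
        return True
    except pp.ParseException:
        return False


print(parses("else1 += else2"))  # names that only start with a keyword
print(parses("else"))            # a bare keyword
```

With this setup, the first call should succeed and the second should fail, because as_keyword=True only matches "else" when it is not followed by more word characters.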
Note that your Word(printables, ...) expression for an infix_term will match almost anything, including ......, integers, floats, etc. Also, the exclude_chars argument does not split the string into operators; it simply treats every character in the string as excluded. So you would not be able to use "-10" as a term, since "-" is in the set of exclude_chars. Give a little more thought to how best to define your operands.
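The per-character behavior of exclude_chars can be seen directly with a short sketch. The helper name first_token and the narrower operand at the end are assumptions added for illustration, one possible direction rather than a recommended definition.

```python
import pyparsing as pp

ops = "** ~ + - * / % & | ^ != == <= >= < > ! , += -= *= /= %= <<= >>= &= |= ^="
term = pp.Word(pp.printables, exclude_chars=ops)


def first_token(text):
    """Hypothetical helper: first token matched by `term`, or None on failure."""
    try:
        return term.parse_string(text)[0]
    except pp.ParseException:
        return None


print(first_token("a=b"))   # "=" is excluded on its own, so the match stops before it
print(first_token("(((("))  # parentheses are printable and not excluded
print(first_token("-10"))   # "-" is excluded, so the term cannot start here

# One possible narrower operand (an assumption): a number or an identifier,
# using the predefined expressions in pyparsing_common.
ppc = pp.pyparsing_common
operand = ppc.number | ppc.identifier
print(operand.parse_string("-10")[0])
print(operand.parse_string("else1")[0])
```

If signed numbers are accepted as operands like this, they may interact with the unary "+" and "-" operators in the infix_notation table, so that trade-off would need to be decided for the actual grammar.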
Lastly, your infix_notation list of operators is pretty long, and this will be a sloooooooowwwww parser if you don't enable packrat parsing (using ParserElement.enable_packrat()).
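As a sketch of where that call goes: packrat parsing is usually enabled right after importing pyparsing, before the grammar is built. The identifier operand below is an assumed stand-in for the question's term; the operator table is the one from the question.

```python
import pyparsing as pp

# Enable packrat memoization before defining the grammar.
pp.ParserElement.enable_packrat()

# Assumed stand-in operand: a plain identifier.
operand = pp.Word(pp.alphas + "_", pp.alphanums + "_")

expression = pp.infix_notation(
    operand,
    [
        ("**", 2, pp.OpAssoc.LEFT),
        (pp.one_of("~ + -"), 1, pp.OpAssoc.RIGHT),
        (pp.one_of("* / % *= /= %="), 2, pp.OpAssoc.LEFT),
        (pp.one_of("<< >> <<= >>="), 2, pp.OpAssoc.LEFT),
        (pp.one_of("& | ^ &= |= ^="), 2, pp.OpAssoc.LEFT),
        (pp.one_of("+ - += -="), 2, pp.OpAssoc.LEFT),
        (pp.one_of("!= == <= >= < >"), 2, pp.OpAssoc.LEFT),
        (pp.one_of("&& ||"), 2, pp.OpAssoc.LEFT),
        ("!", 1, pp.OpAssoc.RIGHT),
    ],
)

result = expression.parse_string("a + b * c", parse_all=True)
print(result.as_list())
```

Packrat parsing caches intermediate match results, which matters here because each precedence level re-tries the levels above it at the same input position.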