Pyparsing grammar not parsing correctly


Here is my grammar:

from pyparsing import Combine, Forward, Group, Literal, Optional, Word
from pyparsing import alphas, delimitedList, infixNotation, nums, oneOf, opAssoc, \
    operatorPrecedence, quotedString, removeQuotes

integer = Combine(Optional(oneOf("+ -")) + Word(nums)).setParseAction(lambda t: int(t[0]))
real = Combine(Optional(oneOf("+ -")) + Word(nums) + "." +
               Optional(Word(nums))).setParseAction(lambda t: float(t[0]))
variable = Word(alphas)
qs = quotedString.setParseAction(removeQuotes)
lt_brac = Literal('[').suppress()
rt_brac = Literal(']').suppress()

exp_op = Literal('^')
mult_op = oneOf('* /')
plus_op = oneOf('+ -')
relation = oneOf('== != < >')
regex_compare = Literal('~')

function_call = Forward()
operand = function_call | qs | real | integer | variable

expr = operatorPrecedence(operand,
                      [
                          (":", 2, opAssoc.LEFT),
                          (exp_op, 2, opAssoc.RIGHT),
                          (regex_compare, 2, opAssoc.LEFT),
                          (mult_op, 2, opAssoc.LEFT),
                          (plus_op, 2, opAssoc.LEFT),
                          (relation, 2, opAssoc.LEFT)
                      ])

bool_operand = expr
bool_expr = infixNotation(bool_operand,
                      [
                          ("not", 1, opAssoc.RIGHT),
                          ("and", 2, opAssoc.LEFT),
                          ("or", 2, opAssoc.LEFT),
                          ])

function_param = function_call | expr | variable | integer | real
function_call <<= Group(variable + lt_brac + Group(Optional(delimitedList(function_param))) + rt_brac)

final_expr = Group(function_call | bool_expr | expr  )
final_expr.enablePackrat()


def parse(expression):
    return final_expr.parseString(expression)

The grammar above is supposed to parse arithmetic expressions; relational statements (<, >, !=, ==) whose operands can be arithmetic expressions; and boolean expressions (or, and, not) whose operands can be arithmetic or relational statements.

The grammar also supports function calls of the form name[params], where each parameter can be an arithmetic expression.

This works fine in most cases. However, when I try to parse the following with the grammar above:

print(parse("abs[abc:sec - abc:sge] > 1"))

I get the following output:

[[['abs', [[['abc', ':', 'sec'], '-', ['abc', ':', 'sge']]]]]]

Why is the ' > 1' ignored?


Solution

  • It's ignored because of this definition of final_expr:

    final_expr = Group(function_call | bool_expr | expr  )
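
To illustrate why the order of alternatives matters, here is a minimal sketch using a hypothetical two-rule grammar (the names `name`, `comparison`, and `short_first` are made up for this example and are not part of the question's grammar). It assumes the camelCase pyparsing API used in the question:

```python
from pyparsing import ParseException, Word, alphas, nums

# Hypothetical miniature grammar, not the one from the question.
name = Word(alphas)
comparison = Word(alphas) + ">" + Word(nums)

# '|' builds a MatchFirst: alternatives are tried left to right and the
# first one that matches wins, even if a later one would match more text.
short_first = name | comparison

# By default parseString does not require the whole input to be consumed,
# so the trailing " > 1" is silently left unparsed.
result = short_first.parseString("abc > 1").asList()
print(result)  # ['abc']

# parseAll=True turns unconsumed trailing input into a ParseException.
try:
    short_first.parseString("abc > 1", parseAll=True)
    leftover_detected = False
except ParseException:
    leftover_detected = True
print(leftover_detected)  # True

# Putting the longer alternative first lets it match the full input.
long_first = comparison | name
full = long_first.parseString("abc > 1", parseAll=True).asList()
print(full)  # ['abc', '>', '1']
```

Passing `parseAll=True` while developing a grammar is a cheap way to catch this kind of silently dropped input.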
    

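    The alternatives are tried left to right, and function_call matches abs[...] by itself, so parsing stops there; parseString does not complain about the unparsed ' > 1' unless you pass parseAll=True. Why do you define this expression this way? An expr is already a simple bool_expr, and a function_call is a simple expr. Just do this: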

    final_expr = bool_expr
    

    And you'll parse your given expression as:

    [[['abs', [[['abc', ':', 'sec'], '-', ['abc', ':', 'sge']]]], '>', 1]]
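
For reference, a reduced, self-contained sketch of the grammar with this fix applied might look like the following. It is a cut-down version of the question's grammar, not a drop-in replacement: several operators, quoted strings, and reals are omitted, `infixNotation` is used in place of the older `operatorPrecedence` name, and `parseAll=True` is an addition of this sketch so that leftover input raises an error instead of being dropped.

```python
from pyparsing import (Forward, Group, Optional, Suppress, Word, alphas,
                       delimitedList, infixNotation, nums, oneOf, opAssoc)

integer = Word(nums).setParseAction(lambda t: int(t[0]))
variable = Word(alphas)

function_call = Forward()
operand = function_call | integer | variable

# Arithmetic and relational operators, highest precedence first.
expr = infixNotation(operand, [
    (":", 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
    (oneOf("== != < >"), 2, opAssoc.LEFT),
])

# Boolean operators layered on top of expr.
bool_expr = infixNotation(expr, [
    ("not", 1, opAssoc.RIGHT),
    ("and", 2, opAssoc.LEFT),
    ("or", 2, opAssoc.LEFT),
])

# name[param, param, ...] with the brackets suppressed from the output.
function_call <<= Group(
    variable + Suppress("[") + Group(Optional(delimitedList(expr))) + Suppress("]")
)


def parse(expression):
    # bool_expr already covers plain expr and function_call inputs.
    return bool_expr.parseString(expression, parseAll=True)


parsed = parse("abs[abc:sec - abc:sge] > 1").asList()
print(parsed)
```

The camelCase names are kept to match the question; recent pyparsing releases also offer snake_case equivalents such as `infix_notation` and `parse_string`.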