python-3.x, pyparsing

Pyparsing trouble passing keywords to the parser


I have been googling for a while, and from here (credit to Mr. Paul) I learned how to pass identifiers to the parser. Here is what I have so far:

from pyparsing import *

class Expressions:
    ParserElement.enablePackrat()

    arith_expr = Forward()

    num = Word(nums) + Optional("." + OneOrMore(Word(nums)))
    opmd = Word("*/", max=1)
    opss = Word("+-", max=1)
    ident = Word(alphas + "_", alphanums + "_")

    fn_call = Group(Optional(delimitedList(arith_expr)))
    arith_operand = fn_call | num | ident

    arith_expr <<= infixNotation(arith_operand, [
        ('-', 1, opAssoc.RIGHT),
        (opmd, 2, opAssoc.LEFT,),
        (opss, 2, opAssoc.LEFT,)
    ])

    def __init__(self, vars):
        if isinstance(vars, list) and vars:
            ids = []
            for x in vars:
                ids.append(x)
            self.ident = MatchFirst(map(Keyword, ids))

    def check(self, text):
        try:
            result = self.arith_expr.parseString(text, True)
            print(result)
        except ParseException as exc:
            print(exc)

Then, if I go to the Python console and do this:

vars = ['v1', 'v2', 'v3,1']
e = Expressions(vars)
e.check('10+v1+v2+v3,1-whatever')

it accepts whatever as a valid token even though it is not defined in vars. How can I solve this?


Solution

  • The ident variable defined in the Expressions class is not a placeholder, so when you assign it in your __init__ method, you are not changing the overall parser, just the definition of self.ident (which creates a new attribute on the Expressions instance, and does not change the class-level ident).
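    The class-vs-instance problem can be reproduced in a few lines. This is only a minimal sketch: the Demo class and its expr attribute are made up for illustration and are not part of the original code.

```python
from pyparsing import Keyword, MatchFirst, Word, alphanums, alphas, nums


class Demo:
    # Class-level ident: accepts any identifier.
    ident = Word(alphas + "_", alphanums + "_")
    # expr captures the ident object that exists at class-definition time.
    expr = Word(nums) + "+" + ident

    def __init__(self, names):
        # Creates a new instance attribute; Demo.ident and Demo.expr are untouched.
        self.ident = MatchFirst([Keyword(name) for name in names])


d = Demo(["v1"])
print(d.ident is Demo.ident)                    # False
print(d.expr.parseString("10+whatever", True))  # still accepts "whatever"
```

    The parser reached through d.expr still refers to the original class-level ident, which is why an undefined name gets through.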

    Why not just define the entire parser in __init__? Then you can define ident using the given variable names, and you can bypass all the class-vs-instance attribute issues, and the updating-part-of-a-parser-after-the-fact issues.

    And what is this code supposed to be doing?

            ids = []
            for x in vars:
                ids.append(x)
    

    There are far easier ways to copy values from one list to another, but why are you even making a copy? Just define ident using the input list of var names (which you might name something other than vars, since that shadows a useful built-in function; maybe call it var_names?).
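    Putting these suggestions together, a rough sketch of the class with the whole parser built in __init__ might look like the following. Some details here are assumptions rather than part of the original code: Combine is used so a number comes back as one token, oneOf replaces the Word(..., max=1) operators, check returns a bool in addition to printing, and function calls are left out to keep the sketch short.

```python
from pyparsing import (Combine, Keyword, MatchFirst, Optional, ParseException,
                       ParserElement, Word, infixNotation, nums, oneOf, opAssoc)

ParserElement.enablePackrat()  # speeds up infixNotation grammars


class Expressions:
    def __init__(self, var_names):
        if not var_names:
            raise ValueError("at least one variable name is required")

        # Only the names passed in are accepted as identifiers.
        ident = MatchFirst([Keyword(name) for name in var_names])
        num = Combine(Word(nums) + Optional("." + Word(nums)))
        arith_operand = num | ident

        # The parser is an instance attribute, built from this instance's ident.
        self.arith_expr = infixNotation(arith_operand, [
            ("-", 1, opAssoc.RIGHT),
            (oneOf("* /"), 2, opAssoc.LEFT),
            (oneOf("+ -"), 2, opAssoc.LEFT),
        ])

    def check(self, text):
        """Print the parse result (or the error) and report success as a bool."""
        try:
            print(self.arith_expr.parseString(text, True))
            return True
        except ParseException as exc:
            print(exc)
            return False


e = Expressions(["v1", "v2", "v3,1"])
e.check("10+v1+v2+v3,1")            # parses
e.check("10+v1+v2+v3,1-whatever")   # rejected: "whatever" is not a known name
```

    Because every instance builds its own grammar, two Expressions objects with different variable lists do not interfere with each other.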

    EDIT: a few more notes

    You need to fix fn_call. As written, it causes infinite recursion, because it is nothing more than a comma-delimited list of arith_exprs. Since arith_expr is itself defined using fn_call, this is left recursion. I think you incompletely copied this from another example: the expression you have is valid for the list of arguments inside a function's arg list, but you are missing the function name and the enclosing parens. Add these and the recursion problem goes away.
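    One possible shape for the repaired fn_call is sketched below. The results names "name" and "args", the LPAR/RPAR helpers, and the use of Combine and oneOf are choices made for this illustration, not something taken from the original code.

```python
from pyparsing import (Combine, Forward, Group, Optional, Suppress, Word,
                       alphanums, alphas, delimitedList, infixNotation,
                       nums, oneOf, opAssoc)

arith_expr = Forward()

num = Combine(Word(nums) + Optional("." + Word(nums)))
ident = Word(alphas + "_", alphanums + "_")
LPAR, RPAR = map(Suppress, "()")

# A function call now starts with a name and "(", so the parser must consume
# input before it recurses into arith_expr: no more left recursion.
fn_call = Group(
    ident("name")
    + LPAR
    + Group(Optional(delimitedList(arith_expr)))("args")
    + RPAR
)

arith_operand = fn_call | num | ident

arith_expr <<= infixNotation(arith_operand, [
    ("-", 1, opAssoc.RIGHT),
    (oneOf("* /"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
])

call = fn_call.parseString("f(1, x)", True)[0]
print(call["name"], call["args"].asList())
print(arith_expr.parseString("f(1, x) * 2 + y", True).asList())
```

    fn_call is tried before ident in arith_operand, so a name followed by "(" is read as a call and a bare name still falls through to ident.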

    One of your vars is 'v3,1'. This is an odd-looking identifier, but luckily it does not conflict with any other bits in your parser. But if you send in an identifier of '3.1' or '42', things will get very confusing. You might want to define a valid_identifier expression and then verify the incoming var names with something like:

    valid_identifier = Word(alphas + '_', alphanums + '_,')
    if not all(valid_identifier.matches(var_name) for var_name in var_names):
        # WhatWereYouThinkingException is a placeholder; define it or use ValueError
        raise WhatWereYouThinkingException("invalid identifier specified")
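    To see which names such a check would reject, here is a small self-contained sketch. The helper invalid_names is hypothetical and only wraps the matches() call shown above.

```python
from pyparsing import Word, alphanums, alphas

# Must start with a letter or underscore; commas are allowed after that.
valid_identifier = Word(alphas + "_", alphanums + "_,")


def invalid_names(var_names):
    """Return the names that are not acceptable identifiers."""
    return [name for name in var_names if not valid_identifier.matches(name)]


print(invalid_names(["v1", "v3,1", "3.1", "42"]))  # ['3.1', '42']
```

    Returning the offending names, instead of only a yes/no answer, makes it easier to put them in the error message.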