pyparsing validation on single/double quotes on strings


I have a grammar for formulas that looks something like this:

import pyparsing as pp

lhs = pp.Word(pp.alphas + pp.alphas8bit + pp.alphanums + "." + "_")
rhs = (
    pp.Optional(pp.oneOf("' \""))
    + pp.Word(pp.alphas + pp.alphas8bit + pp.alphanums + "." + "_" + "-" + ":")
    + pp.Optional(pp.oneOf("' \""))
)
expression = lhs + pp.oneOf("> < = >= <=") + rhs

It works fine for valid input:

>>> print(expression.parseString("name = 'user1'"))
['name', '=', "'", 'user1', "'"]
>>> print(expression.parseString('user.id >= 10'))
['user.id', '>=', '10']

But how can I enforce that when the rhs starts with " (double quote) it must also end with ", and likewise for ' (single quote)? And when the value starts with neither of them (for example, an integer), it must end with neither. At the moment none of this is validated:

>>> print(expression.parseString("name = \"user1'"))  # invalid!
['name', '=', '"', 'user1', "'"]
>>> print(expression.parseString('user.id >= 10"'))  # invalid!
['user.id', '>=', '10', '"']

How can I add this kind of validation?


Solution

  • Try this, which defines a separate alternative for each quoting style:

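    import pyparsing as pp

    lhs = pp.Word(pp.alphas + pp.alphas8bit + pp.alphanums + "." + "_")
    rhs_str = pp.Word(pp.alphas + pp.alphas8bit + pp.alphanums + "." + "_" + "-" + ":")
    rhs_num = pp.Word(pp.nums)
    # Each alternative requires the same quote character on both sides;
    # Suppress() drops the quotes from the parsed result.
    rhs = (
        (pp.Suppress('"') + rhs_str + pp.Suppress('"'))
        ^ (pp.Suppress("'") + rhs_str + pp.Suppress("'"))
        ^ (rhs_num)  # unquoted values must be digits only
    )
    expression = lhs + pp.oneOf("> < = >= <=") + rhs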
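One thing to watch for: by default parseString stops at the end of the first match and ignores trailing text, so an input like 'user.id >= 10"' would still parse as ['user.id', '>=', '10']. Passing parseAll=True makes the stray quote an error. A minimal sketch of how this grammar could be used as a validator (is_valid and word_chars are hypothetical names, not part of pyparsing or the original post):

```python
import pyparsing as pp

# Same grammar as above, with the shared character set factored out.
word_chars = pp.alphas + pp.alphas8bit + pp.alphanums + "._"
lhs = pp.Word(word_chars)
rhs_str = pp.Word(word_chars + "-:")
rhs_num = pp.Word(pp.nums)
rhs = (
    (pp.Suppress('"') + rhs_str + pp.Suppress('"'))
    ^ (pp.Suppress("'") + rhs_str + pp.Suppress("'"))
    ^ rhs_num
)
expression = lhs + pp.oneOf("> < = >= <=") + rhs


def is_valid(text):
    """Hypothetical helper: True if the whole input matches the grammar."""
    try:
        # parseAll=True requires the match to consume the entire string,
        # so leftover characters such as a stray quote raise an exception.
        expression.parseString(text, parseAll=True)
        return True
    except pp.ParseException:
        return False


for sample in ["name = 'user1'", 'user.id >= 10',
               "name = \"user1'", 'user.id >= 10"']:
    print(repr(sample), is_valid(sample))
```

With this check the first two samples should be accepted and the last two (mismatched quotes, trailing quote after a number) rejected.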