Search code examples
pythonparsingpyparsing

python calculating parsed logical expression (without eval)


I'm looking for the best way of calculating a string expression that holds a logical expression like: x and not y or (z or w) or a json logic like: {"and": {"x": "true", "y": "false"}, "or": {"or": {"z": "true", "w": "True"}}}

I decide how to config file should look like so the format could be changed, it's just an example.

I saw a way to parse the first string using pyparse to something like: ['x', 'AND', 'y', 'OR', ['z', 'OR', 'W']] but I don't know how to get the value of the expression after I parsed it.

About the json logic format I saw the package json-logic tried it but it seems like the code is broken since I can't run any of their examples so if there is something else I think that would be great.

x = True
y = False
z = False
w = False
calc_string_as_bool('x and not y or (z or w)') -> True

Thanks for the help.


Solution

  • If you want to use Python's parser but not its evaluator, you can do that. You'll have to evaluate the expression yourself, but if you only need to handle boolean expressions, that won't be too difficult. In any event, it's a reasonable way to learn how to write an evaluator without having to worry about parsing.

    To parse an expression, you can use ast.parse with mode='eval'; that will return a AST without evaluating anything.

    Evaluating an AST is generally done with a depth-first scan, which is easy enough to write, but the ast module comes with the NodeVisitor class which takes care of some of the details. You need to create your own subclass of NodeVisitor in which you define methods named visit_X for each AST node type X. These methods can return results, so typically an evaluator will just define a visitor method for the particular AST node types which can be evaluated; these method will evaluate each child node using the subclass's visit method, combine the values as appropriate, and return the result.

    That was probably not very clear, so I'll put an an example below.

    Important: the AST nodes are all documented in Python's library reference manual. You'll need to refer to that document constantly while you're writing your evaluator. Make sure that you are using the version of the documentation which corresponds to your Python version, since the AST structure does change from time to time.

    Here's a simple evaluator (honest! it's mostly comments) which handles the boolean operators and, if and not. It handles variables by looking their names up in a dictionary (which must be provided when you construct the evaluator object), and it knows about constant values. Anything it doesn't understand causes it to raise an exception, and I tried to be fairly strict.

    import ast
    class BooleanEvaluator(ast.NodeVisitor):
        def __init__(self, symtab = None):
            '''Create a new Boolean Evaluator.
               If you want to allow named variables, give the constructor a
               dictionary which maps names to values. Any name not in the 
               dictionary will provoke a NameError exception, but you could
               use a defaultdict to provide a default value (probably False).
               You can modify the symbol table after the evaluator is
               constructed, if you want to evaluate an expression with different
               values.
            '''
            self.symtab = {} if symtab is None else symtab
    
        # Expression is the top-level AST node if you specify mode='eval'.
        # That's not made very clear in the documentation. It's different
        # from an Expr node, which represents an expression statement (and
        # there are no statements in a tree produced with mode='eval').
        def visit_Expression(self, node):
            return self.visit(node.body)
    
        # 'and' and 'or' are BoolOp, and the parser collapses a sequence of
        # the same operator into a single AST node. The class of the .op
        # member identifies the operator, and the .values member is a list 
        # of expressions.
        def visit_BoolOp(self, node):
            if isinstance(node.op, ast.And):
                return all(self.visit(c) for c in node.values)
            elif isinstance(node.op, ast.Or):
                return any(self.visit(c) for c in node.values)
            else:
                # This "shouldn't happen".
                raise NotImplementedError(node.op.__doc__ + " Operator")
    
        # 'not' is a UnaryOp. So are a number of other things, like unary '-'.
        def visit_UnaryOp(self, node):
            if isinstance(node.op, ast.Not):
                 return not self.visit(node.operand)
            else:
                # This error can happen. Try using the `~` operator.
                raise NotImplementedError(node.op.__doc__ + " Operator")
    
        # Name is a variable name. Technically, we probably should check the
        # ctx member, but unless you decide to handle the walrus operator you
        # should never see anything other than `ast.Load()` as ctx.
        # I didn't check that the symbol table contains a boolean value,
        # but you could certainly do that.
        def visit_Name(self, node):
            try:
                return self.symtab[node.id]
            except KeyError:
                raise NameError(node.id)
    
        # The only constants we're really interested in are True and False,
        # but you could extend this to handle other values like 0 and 1
        # if you wanted to be more liberal
        def visit_Constant(self, node):
            if isinstance(node.value, bool):
                return node.value
            else:
                # I avoid calling str on the value in case that executes
                # a dunder method.
                raise ValueError("non-boolean value")
    
        # The `generic_visit` method is called for any AST node for
        # which a specific visitor method hasn't been provided.
        def generic_visit(self, node):
            raise RuntimeError("non-boolean expression")
    

    Here's a sample of how to use the evaluator. This function takes an expression which uses exactly three variables, which must be named x, y and z, and produces a truth table.

    import itertools
    def truthTable(expr):
        evaluator = BooleanEvaluator()
        theAST = ast.parse(expr, mode='eval')
        print("x  y  z  " + expr) 
        for x, y, z in itertools.product([True, False], repeat=3):
            evaluator.symtab = {'x': x, 'y': y, 'z': z}
            print('  '.join("FT"[b] for b in (x, y, z, evaluator.visit(theAST))))
    
    truthTable('x and (y or not z) and (not y or z)')