Search code examples
pythonparsingcomparison-operatorspython-internals

How does three operands comparison work in Python under the hood?


Can you please explain what does the syntactic parse tree look like for a chained comparison?

As far as I understand in most of languages it constructs nodes based on operator associativity so in a < b < c there will be a boolean value either as a left- or right-hand operand.

But in Python such expression is almost equivalent to a < b and b < c (with b only evaluated once).

What is a generative grammar rule for such a transformation? Basically, what does a Python interpreter do to construct a parse tree in such case?


Solution

  • The comparison grammar is not that interesting here, it simply lets you append multiple comparators to an operator:

    comparison    ::=  or_expr ( comp_operator or_expr )*
    comp_operator ::=  "<" | ">" | "==" | ">=" | "<=" | "!="
                       | "is" ["not"] | ["not"] "in"
    

    So lets ask the Python parser directly, by using the ast module (which simply asks the Python compiler itself to only return the Abstract Syntax Tree):

    >>> import ast
    >>> ast.dump(ast.parse('a > b > c', mode='eval'))
    "Expression(body=Compare(left=Name(id='a', ctx=Load()), ops=[Gt(), Gt()], comparators=[Name(id='b', ctx=Load()), Name(id='c', ctx=Load())]))"
    

    So there is just a single Compare node, with multiple operators and comparators:

    Compare(
        left=Name(id='a'),
        ops=[Gt(), Gt()],
        comparators=[Name(id='b'), Name(id='c')])
    

    (I omitted the Expression and ctx parts).

    This lets the interpreter evaluate comparators as needed (e.g. if a < b is false, the remaining comparators don't have to be considered).

    The resulting bytecode uses conditional jumps to skip remaining comparisons:

    >>> import dis
    >>> dis.dis(compile('a > b > c', '', 'eval'))
      1           0 LOAD_NAME                0 (a)
                  2 LOAD_NAME                1 (b)
                  4 DUP_TOP
                  6 ROT_THREE
                  8 COMPARE_OP               4 (>)
                 10 JUMP_IF_FALSE_OR_POP    18
                 12 LOAD_NAME                2 (c)
                 14 COMPARE_OP               4 (>)
                 16 RETURN_VALUE
            >>   18 ROT_TWO
                 20 POP_TOP
                 22 RETURN_VALUE