python parsing comparison-operators python-internals

How does three operands comparison work in Python under the hood?

Can you please explain what does the syntactic parse tree look like for a chained comparison?

As far as I understand in most of languages it constructs nodes based on operator associativity so in a < b < c there will be a boolean value either as a left- or right-hand operand.

But in Python such expression is almost equivalent to a < b and b < c (with b only evaluated once).

What is a generative grammar rule for such a transformation? Basically, what does a Python interpreter do to construct a parse tree in such case?

Solution

The comparison grammar is not that interesting here, it simply lets you append multiple comparators to an operator:

comparison    ::=  or_expr ( comp_operator or_expr )*
comp_operator ::=  "<" | ">" | "==" | ">=" | "<=" | "!="
                   | "is" ["not"] | ["not"] "in"

So lets ask the Python parser directly, by using the ast module (which simply asks the Python compiler itself to only return the Abstract Syntax Tree):

>>> import ast
>>> ast.dump(ast.parse('a > b > c', mode='eval'))
"Expression(body=Compare(left=Name(id='a', ctx=Load()), ops=[Gt(), Gt()], comparators=[Name(id='b', ctx=Load()), Name(id='c', ctx=Load())]))"

So there is just a single Compare node, with multiple operators and comparators:

Compare(
    left=Name(id='a'),
    ops=[Gt(), Gt()],
    comparators=[Name(id='b'), Name(id='c')])

(I omitted the Expression and ctx parts).

This lets the interpreter evaluate comparators as needed (e.g. if a < b is false, the remaining comparators don't have to be considered).

The resulting bytecode uses conditional jumps to skip remaining comparisons:

>>> import dis
>>> dis.dis(compile('a > b > c', '', 'eval'))
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 DUP_TOP
              6 ROT_THREE
              8 COMPARE_OP               4 (>)
             10 JUMP_IF_FALSE_OR_POP    18
             12 LOAD_NAME                2 (c)
             14 COMPARE_OP               4 (>)
             16 RETURN_VALUE
        >>   18 ROT_TWO
             20 POP_TOP
             22 RETURN_VALUE