Can you please explain what does the syntactic parse tree look like for a chained comparison?
As far as I understand in most of languages it constructs nodes based on operator associativity so in a < b < c
there will be a boolean value either as a left- or right-hand operand.
But in Python such expression is almost equivalent to a < b and b < c
(with b
only evaluated once).
What is a generative grammar rule for such a transformation? Basically, what does a Python interpreter do to construct a parse tree in such case?
The comparison grammar is not that interesting here, it simply lets you append multiple comparators to an operator:
comparison ::= or_expr ( comp_operator or_expr )*
comp_operator ::= "<" | ">" | "==" | ">=" | "<=" | "!="
| "is" ["not"] | ["not"] "in"
So lets ask the Python parser directly, by using the ast
module (which simply asks the Python compiler itself to only return the Abstract Syntax Tree):
>>> import ast
>>> ast.dump(ast.parse('a > b > c', mode='eval'))
"Expression(body=Compare(left=Name(id='a', ctx=Load()), ops=[Gt(), Gt()], comparators=[Name(id='b', ctx=Load()), Name(id='c', ctx=Load())]))"
So there is just a single Compare
node, with multiple operators and comparators:
Compare(
left=Name(id='a'),
ops=[Gt(), Gt()],
comparators=[Name(id='b'), Name(id='c')])
(I omitted the Expression
and ctx
parts).
This lets the interpreter evaluate comparators as needed (e.g. if a < b
is false, the remaining comparators don't have to be considered).
The resulting bytecode uses conditional jumps to skip remaining comparisons:
>>> import dis
>>> dis.dis(compile('a > b > c', '', 'eval'))
1 0 LOAD_NAME 0 (a)
2 LOAD_NAME 1 (b)
4 DUP_TOP
6 ROT_THREE
8 COMPARE_OP 4 (>)
10 JUMP_IF_FALSE_OR_POP 18
12 LOAD_NAME 2 (c)
14 COMPARE_OP 4 (>)
16 RETURN_VALUE
>> 18 ROT_TWO
20 POP_TOP
22 RETURN_VALUE