I have the following basic Python function:
def squared(num):
    if num < 2:
        print ('OK!')
    return num * num
Which produces the following bytecode:
>>> dis.dis(squared)
  2           0 LOAD_FAST                0 (num)
              3 LOAD_CONST               1 (2)
              6 COMPARE_OP               0 (<)
              9 POP_JUMP_IF_FALSE       20

  3          12 LOAD_CONST               2 ('OK!')
             15 PRINT_ITEM
             16 PRINT_NEWLINE
             17 JUMP_FORWARD             0 (to 20)

  4     >>   20 LOAD_FAST                0 (num)
             23 LOAD_FAST                0 (num)
             26 BINARY_MULTIPLY
             27 RETURN_VALUE
Most of the above look like mov- and jmp-type operations. However, what do LOAD_FAST and LOAD_CONST most closely correspond to in assembly? What might be the closest assembly instructions for these?
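Python's bytecode targets a stack-based VM. That design simplifies the interpreter and wastes less space on operand addresses (e.g. register numbers) by making them implicit: every instruction works on the top of an operand stack. The LOAD_* instructions push a value onto that stack.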
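To make the "loads are pushes" point concrete, here is a minimal sketch of a toy interpreter for just the opcodes in the listing from the question. It is an illustrative model, not how CPython is implemented: the names PROGRAM, CONSTS and run are made up for this example, the constants tuple (with None at index 0) is an assumption that matches the indices shown in the disassembly, and printed text is collected in a string rather than written to stdout.

```python
# Toy model of the stack machine, covering only the opcodes in the listing.
# PROGRAM, CONSTS and run() are hypothetical names used for illustration.

PROGRAM = [
    # (byte offset, opcode name, argument)
    (0,  "LOAD_FAST", 0),           # push local variable 0 (num)
    (3,  "LOAD_CONST", 1),          # push constant 1 (2)
    (6,  "COMPARE_OP", 0),          # 0 means '<' in the listing
    (9,  "POP_JUMP_IF_FALSE", 20),  # absolute jump target
    (12, "LOAD_CONST", 2),          # push constant 2 ('OK!')
    (15, "PRINT_ITEM", None),
    (16, "PRINT_NEWLINE", None),
    (17, "JUMP_FORWARD", 0),        # relative to the next instruction
    (20, "LOAD_FAST", 0),
    (23, "LOAD_FAST", 0),
    (26, "BINARY_MULTIPLY", None),
    (27, "RETURN_VALUE", None),
]
CONSTS = (None, 2, "OK!")  # assumed layout; indices 1 and 2 match the listing


def run(program, consts, fast_locals):
    """Return (return value, printed text) for the given local variables."""
    index_of = {offset: i for i, (offset, _, _) in enumerate(program)}
    stack = []    # the implicit operand stack
    printed = []  # collected output instead of writing to stdout
    i = 0
    while True:
        _, op, arg = program[i]
        i += 1  # default: fall through to the next instruction
        if op == "LOAD_FAST":
            stack.append(fast_locals[arg])   # a load is just a push
        elif op == "LOAD_CONST":
            stack.append(consts[arg])
        elif op == "COMPARE_OP":
            assert arg == 0, "only '<' is modelled in this sketch"
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs < rhs)
        elif op == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                i = index_of[arg]
        elif op == "PRINT_ITEM":
            printed.append(str(stack.pop()))
        elif op == "PRINT_NEWLINE":
            printed.append("\n")
        elif op == "JUMP_FORWARD":
            # i already points at the next instruction; add the relative offset
            i = index_of[program[i][0] + arg]
        elif op == "BINARY_MULTIPLY":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs * rhs)
        elif op == "RETURN_VALUE":
            return stack.pop(), "".join(printed)
        else:
            raise ValueError("unsupported opcode: " + op)
```

Calling run(PROGRAM, CONSTS, [5]) walks the same path as squared(5). Note that no instruction names a register or a destination: operands are always whatever is on top of the stack.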
If you transliterated this in a very literal and braindead manner to an asm equivalent for fixed-width integers (unlike Python's arbitrary-precision integers), every LOAD_FAST might become a load from a local variable (in memory on the stack) into a register. You'd still have to choose which registers to use, though, and real ISAs have a limited number of them. But yes, LOAD_FAST is like a load.
Of course, if you weren't intentionally being literal just for the sake of it, you'd know you already had num in a register and wouldn't load it again. So you'd use an instruction that reads the same register twice, like imul eax, eax. And you'd keep local variables in registers when possible, not spilling them to the stack in the first place until you run out of registers.
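One consequence of the fixed-width caveat: a two-operand imul eax, eax keeps only the low 32 bits of the product, whereas Python's num * num never overflows. The sketch below models that truncation in Python, assuming 32-bit two's-complement registers; imul32 and squared_fixed_width are hypothetical helper names, not part of any library.

```python
def imul32(a, b):
    """Model a 32-bit signed multiply that keeps only the low 32 bits."""
    low = (a * b) & 0xFFFFFFFF          # truncate to 32 bits
    # Reinterpret the 32-bit pattern as a signed two's-complement value.
    return low - 0x100000000 if low & 0x80000000 else low


def squared_fixed_width(num):
    """What a single 32-bit multiply of a register by itself would produce."""
    return imul32(num, num)
```

For small inputs this agrees with the Python function, but squared_fixed_width(65536) wraps around to 0, while squared(65536) in Python returns 4294967296.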
If you want to learn about asm for CPUs, you can write an equivalent function in C and compile it (with optimization enabled, at least -Og if not -O2 or -O3) on the Godbolt compiler explorer: https://godbolt.org/. You can also do this on your desktop, but Matt Godbolt wrote some nice filtering to remove noise and leave only the interesting parts. See also How to remove "noise" from GCC/clang assembly output?