python assembly bytecode pyc

Closest assembly related to python bytecode


I have the following basic Python function:

def squared(num):
    if num < 2:
        print ('OK!')
    return num * num

Which produces the following bytecode:

>>> dis.dis(squared)
  2           0 LOAD_FAST                0 (num)
              3 LOAD_CONST               1 (2)
              6 COMPARE_OP               0 (<)
              9 POP_JUMP_IF_FALSE       20

  3          12 LOAD_CONST               2 ('OK!')
             15 PRINT_ITEM          
             16 PRINT_NEWLINE       
             17 JUMP_FORWARD             0 (to 20)

  4     >>   20 LOAD_FAST                0 (num)
             23 LOAD_FAST                0 (num)
             26 BINARY_MULTIPLY     
             27 RETURN_VALUE        

Most of the above look like mov- and jmp-type instructions. However, what do the following most closely correspond to in assembly?

LOAD_FAST, LOAD_CONST

What might be the closest assembly instructions for these?


Solution

  • Python's bytecode targets a stack-based VM. That design keeps interpreters simple and wastes less space on addresses (e.g. register numbers) by making operands implicit. Loads push onto that stack: LOAD_FAST pushes a local variable and LOAD_CONST pushes a constant.

    If you transliterated this in a very literal and braindead manner to an asm equivalent for fixed-width integers (unlike Python's arbitrary-precision integers), every LOAD_FAST might be a load from a local variable (in memory on the stack) into a register. You'd still have to choose which registers to use, and real ISAs have a limited number of them. So yes, LOAD_FAST is like a load. LOAD_CONST would be closest to a mov with an immediate operand, although a constant is often just folded into the instruction that uses it (e.g. cmp eax, 2).

    Of course, if you weren't intentionally being literal just for the sake of it, you'd know you already had num in a register and wouldn't load it again. So you'd use an instruction that reads the same register twice, like imul eax, eax.

    And you'd keep local variables in registers whenever possible, only spilling them to the stack once you run out of registers.


    If you want to learn about asm for CPUs, you can write an equivalent function in C and compile it (with optimization enabled, at least -Og if not -O2 or -O3) on the Godbolt compiler explorer: https://godbolt.org/. You could also compile it on your desktop, but Matt Godbolt wrote some nice filtering that removes noise and leaves only the interesting parts. See also: How to remove "noise" from GCC/clang assembly output?
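
To make the point about loads pushing onto the VM's stack concrete, here is a minimal sketch of a stack interpreter for the bytecode in the question. It is not how CPython is implemented. The names run, SQUARED and CONSTS are made up for illustration, jump targets are instruction indices rather than byte offsets, the no-op JUMP_FORWARD is dropped, and PRINT_ITEM / PRINT_NEWLINE are merged into a hypothetical PRINT op that records its output in a list.

```python
import operator

# COMPARE_OP argument 0 means "<" in the disassembly from the question.
COMPARE = {0: operator.lt}

def run(code, consts, local_vars):
    """Interpret a tiny bytecode subset; return (result, printed_items)."""
    stack = []      # the VM's operand stack; operands are implicit
    printed = []    # stand-in for stdout so the sketch is easy to check
    pc = 0          # index of the next instruction
    while True:
        op, arg = code[pc]
        pc += 1
        if op == "LOAD_FAST":            # push a local variable
            stack.append(local_vars[arg])
        elif op == "LOAD_CONST":         # push a constant
            stack.append(consts[arg])
        elif op == "COMPARE_OP":         # pop two values, push the comparison
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(COMPARE[arg](lhs, rhs))
        elif op == "POP_JUMP_IF_FALSE":  # pop the condition, maybe jump
            if not stack.pop():
                pc = arg
        elif op == "PRINT":              # hypothetical merged print op
            printed.append(stack.pop())
        elif op == "BINARY_MULTIPLY":    # pop two values, push the product
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs * rhs)
        elif op == "RETURN_VALUE":
            return stack.pop(), printed
        else:
            raise ValueError("unknown opcode: " + op)

CONSTS = (None, 2, "OK!")
SQUARED = [
    ("LOAD_FAST", 0),          # 0: push num
    ("LOAD_CONST", 1),         # 1: push 2
    ("COMPARE_OP", 0),         # 2: num < 2
    ("POP_JUMP_IF_FALSE", 6),  # 3: skip the print when the test fails
    ("LOAD_CONST", 2),         # 4: push 'OK!'
    ("PRINT", None),           # 5: print it
    ("LOAD_FAST", 0),          # 6: push num
    ("LOAD_FAST", 0),          # 7: push num again
    ("BINARY_MULTIPLY", None), # 8: num * num
    ("RETURN_VALUE", None),    # 9: return top of stack
]
```

Calling run(SQUARED, CONSTS, [5]) walks the same steps as the disassembly: neither load names a destination, because the destination is always the top of the stack.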
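
The difference between a literal transliteration and keeping num in a register can also be sketched in Python. This only models the return num * num part and assumes 32-bit registers, which is a choice made for this example. The function names and the load counter are hypothetical, and the asm in the comments is just the instruction each line is meant to imitate.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Wrap an int to a signed 32-bit value, as a fixed-width register would."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def squared_literal(num):
    """Naive version: one memory load per LOAD_FAST."""
    frame = {"num": to_signed32(num)}  # local variable spilled to the stack
    loads = 0
    eax = frame["num"]                 # LOAD_FAST num   ~ mov eax, [num]
    loads += 1
    ecx = frame["num"]                 # LOAD_FAST num   ~ mov ecx, [num]
    loads += 1
    eax = to_signed32(eax * ecx)       # BINARY_MULTIPLY ~ imul eax, ecx
    return eax, loads

def squared_in_register(num):
    """num already lives in a register, so no loads are needed."""
    eax = to_signed32(num)             # argument assumed to be in eax
    eax = to_signed32(eax * eax)       # imul eax, eax reads eax twice
    return eax, 0
```

Both versions return the same product, but the second does it without touching memory. Unlike Python's arbitrary-precision integers, the result wraps: squared_in_register(65536) yields 0 because 2**32 does not fit in 32 bits.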