Tags: python, bytecode, disassembly, python-internals

Why does a class definition always produce the same bytecode?


Say I do:

#!/usr/bin/env python
# encoding: utf-8

class A(object):
    pass

Now I disassemble it:

python -m dis test0.py 
  4           0 LOAD_CONST               0 ('A')
              3 LOAD_NAME                0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               1 (<code object A at 0x1004ebb30, file "test0.py", line 4>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS         
             19 STORE_NAME               1 (A)
             22 LOAD_CONST               2 (None)
             25 RETURN_VALUE        

Now I add some statements in the class definition:

#!/usr/bin/env python
# encoding: utf-8

class A(object):
    print 'hello'
    1+1
    pass

And I disassemble again:

  4           0 LOAD_CONST               0 ('A')
              3 LOAD_NAME                0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               1 (<code object A at 0x1004ebb30, file "test0.py", line 4>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS         
             19 STORE_NAME               1 (A)
             22 LOAD_CONST               2 (None)
             25 RETURN_VALUE        

Why don't the new statements appear in the new bytecode?


Solution

  • The new statements are compiled into a separate, nested code object. Your disassembly shows that code object being loaded as a constant:

          9 LOAD_CONST               1 (<code object A at 0x1004ebb30, file "test0.py", line 4>)
    

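    You need to inspect that code object instead; the output of python -m dis here covers only the module-level code. The class body is executed much like a function body: the nested code object is wrapped in a function (MAKE_FUNCTION) and called (CALL_FUNCTION), and the local namespace that call produces is then used to form the class members.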

    Demo:

    >>> import dis
    >>> def wrapper():
    ...     class A(object):
    ...         pass
    ... 
    >>> dis.dis(wrapper)
      2           0 LOAD_CONST               1 ('A')
                  3 LOAD_GLOBAL              0 (object)
                  6 BUILD_TUPLE              1
                  9 LOAD_CONST               2 (<code object A at 0x104b99930, file "<stdin>", line 2>)
                 12 MAKE_FUNCTION            0
                 15 CALL_FUNCTION            0
                 18 BUILD_CLASS         
                 19 STORE_FAST               0 (A)
                 22 LOAD_CONST               0 (None)
                 25 RETURN_VALUE        
    >>> dis.dis(wrapper.__code__.co_consts[2])
      2           0 LOAD_NAME                0 (__name__)
                  3 STORE_NAME               1 (__module__)
    
      3           6 LOAD_LOCALS         
                  7 RETURN_VALUE        
    

    This is the same setup as your first sample. The class body is reachable through the wrapper.__code__.co_consts tuple, which is what the LOAD_CONST bytecode indexes into; here the index is 2.
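
The constant index (2 here) depends on what else the enclosing code object stores, so it can be more robust to look the class body up by type and name. A minimal sketch, assuming a module source string; `SOURCE` and `nested_code_objects` are names made up for this illustration, and the code is written so that it should also run on Python 3, where the opcodes differ from the Python 2 listings in this answer:

```python
import types

# Hypothetical module source, mirroring the question's test0.py.
SOURCE = """
class A(object):
    x = 1 + 1
"""

def nested_code_objects(code):
    """Yield every code object found in code.co_consts, recursively."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield const
            for inner in nested_code_objects(const):
                yield inner

# Compile the source the same way the interpreter compiles a module.
module_code = compile(SOURCE, "test0.py", "exec")

# Select the class body by its name instead of a hard-coded index.
class_bodies = [c for c in nested_code_objects(module_code)
                if c.co_name == "A"]
```

Passing `class_bodies[0]` to `dis.dis` then disassembles the class body, just like `dis.dis(wrapper.__code__.co_consts[2])` does above.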

    Now we can add a class body:

    >>> def wrapper():
    ...     class A(object):
    ...         print 'hello'
    ...         1+1
    ...         pass
    ... 
    >>> dis.dis(wrapper)
      2           0 LOAD_CONST               1 ('A')
                  3 LOAD_GLOBAL              0 (object)
                  6 BUILD_TUPLE              1
                  9 LOAD_CONST               2 (<code object A at 0x104b4adb0, file "<stdin>", line 2>)
                 12 MAKE_FUNCTION            0
                 15 CALL_FUNCTION            0
                 18 BUILD_CLASS         
                 19 STORE_FAST               0 (A)
                 22 LOAD_CONST               0 (None)
                 25 RETURN_VALUE        
    >>> dis.dis(wrapper.__code__.co_consts[2])
      2           0 LOAD_NAME                0 (__name__)
                  3 STORE_NAME               1 (__module__)
    
      3           6 LOAD_CONST               0 ('hello')
                  9 PRINT_ITEM          
                 10 PRINT_NEWLINE       
    
      4          11 LOAD_CONST               2 (2)
                 14 POP_TOP             
    
      5          15 LOAD_LOCALS         
                 16 RETURN_VALUE        
    

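    Now the class body appears; we can see the bytecode that will be executed when the class statement runs. Note that 1+1 was folded into the constant 2 at compile time, which is why only LOAD_CONST and POP_TOP remain for that line.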

    Of note are the LOAD_NAME and STORE_NAME bytecodes executed at the start of every class body; they retrieve the module name (__name__) and store it as a new local name, __module__, so that your class ends up with a __module__ attribute once created.

    The LOAD_LOCALS bytecode then gathers all the local names produced in this 'function' and returns them to the caller, so that the BUILD_CLASS bytecode can combine that namespace with the 'A' string and the bases tuple (created with BUILD_TUPLE) to produce your new class object.
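
The same sequence can be imitated by hand, which may make the roles of the pieces clearer. This is only a sketch under stated assumptions, not what the interpreter literally does: `SOURCE`, `body_code` and the module name "demo" are made up for the illustration, `exec` stands in for calling the class-body function, and `type(name, bases, namespace)` stands in for BUILD_CLASS. It is written so that it should also run on Python 3, where the class body stores a few extra names.

```python
import types

# Hypothetical module source with a small class body.
SOURCE = """
class A(object):
    greeting = 'hello'
    total = 1 + 1
"""

module_code = compile(SOURCE, "test0.py", "exec")

# Pick the class body out of the module's constants by type.
body_code = [c for c in module_code.co_consts
             if isinstance(c, types.CodeType)][0]

# Run the class body with `namespace` as its locals. __name__ must be
# resolvable because the body copies it into __module__.
namespace = {}
exec(body_code, {"__name__": "demo"}, namespace)

# Combine name, bases and namespace into a class, as BUILD_CLASS does
# with the three values left on the stack.
A = type("A", (object,), namespace)
```

After this runs, `A.greeting` and `A.total` come straight from the names the class body stored, and `A.__module__` is the value copied from `__name__`.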