python ctypes bytecode cpython pdb

Debug the CPython opcode stack


CPython 3.7 introduced the ability to step through individual opcodes in a debugger. However, I can't figure out how to read variables out of the bytecode stack.

For example, when debugging

def f(a, b, c):
    return a * b + c

f(2, 3, 4)

I want to find out that the inputs of the addition are 6 and 4. Note how 6 never touches locals().

So far I could only come up with the opcode information, but I don't know how to get the opcode inputs:

import dis
import sys


def tracefunc(frame, event, arg):
    frame.f_trace_opcodes = True
    print(event, frame.f_lineno, frame.f_lasti, frame, arg)
    if event == "call":
        dis.dis(frame.f_code)
    elif event == "opcode":
        instr = next(
            i for i in iter(dis.Bytecode(frame.f_code))
            if i.offset == frame.f_lasti
        )
        print(instr)
    print("-----------")
    return tracefunc


def f(a, b, c):
    return a * b + c


sys.settrace(tracefunc)
f(2, 3, 4)

Output:

call 19 -1 <frame at 0x7f97df618648, file 'test_trace.py', line 19, code f> None
 20           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_MULTIPLY
              6 LOAD_FAST                2 (c)
              8 BINARY_ADD
             10 RETURN_VALUE
-----------
line 20 0 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> None
-----------
opcode 20 0 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> None
Instruction(opname='LOAD_FAST', opcode=124, arg=0, argval='a', argrepr='a', offset=0, starts_line=20, is_jump_target=False)
-----------
opcode 20 2 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> None
Instruction(opname='LOAD_FAST', opcode=124, arg=1, argval='b', argrepr='b', offset=2, starts_line=None, is_jump_target=False)
-----------
opcode 20 4 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> None
Instruction(opname='BINARY_MULTIPLY', opcode=20, arg=None, argval=None, argrepr='', offset=4, starts_line=None, is_jump_target=False)
-----------
opcode 20 6 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> None
Instruction(opname='LOAD_FAST', opcode=124, arg=2, argval='c', argrepr='c', offset=6, starts_line=None, is_jump_target=False)
-----------
opcode 20 8 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> None
Instruction(opname='BINARY_ADD', opcode=23, arg=None, argval=None, argrepr='', offset=8, starts_line=None, is_jump_target=False)
-----------
opcode 20 10 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> None
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=10, starts_line=None, is_jump_target=False)
-----------
return 20 10 <frame at 0x7f97df618648, file 'test_trace.py', line 20, code f> 10
-----------

Solution

  • TLDR

    You can inspect CPython's inter-opcode state with a C extension, with GDB, or with dirty tricks in pure Python (examples below).

    Background

    CPython's bytecode is run by a stack machine. That means that all state between opcodes is kept in a stack of PyObject*s.

    Let's take a quick look at CPython's frame object:

    typedef struct _frame {
        PyObject_VAR_HEAD
        struct _frame *f_back;      /* previous frame, or NULL */
        PyCodeObject *f_code;       /* code segment */
        ... // More fields
        PyObject **f_stacktop;
        ... // More fields
    } PyFrameObject;
    

    See the PyObject **f_stacktop near the end? It points to the top of this stack (more precisely, to the slot just past the topmost item). Most of CPython's opcodes use that stack to get their operands and store their results.

    For example, let's take a look at the implementation for BINARY_ADD (addition with two operands):

    case TARGET(BINARY_ADD): {
        PyObject *right = POP();
        PyObject *left = TOP();
        ... // sum = left + right
        SET_TOP(sum);
        ...
    }
    

    It takes the two topmost values off the stack, adds them, and puts the result back on the stack.
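To make the stack discipline concrete, here is a minimal pure-Python sketch of a stack machine running the bytecode of f from the question. It is a toy model, not CPython's evaluation loop: PROGRAM, run, and add_inputs are names made up for this illustration.

```python
# Toy model of a stack machine; not CPython's real evaluation loop.
# PROGRAM mirrors the disassembly of f(a, b, c) shown in the question.
PROGRAM = [
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_MULTIPLY", None),
    ("LOAD_FAST", "c"),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]


def run(program, local_vars):
    """Run the program; return (result, operands seen by each BINARY_ADD)."""
    stack = []
    add_inputs = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            # Push the value of a local variable.
            stack.append(local_vars[arg])
        elif opname == "BINARY_MULTIPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            # Record the operands before they are consumed.
            add_inputs.append((left, right))
            stack.append(left + right)
        elif opname == "RETURN_VALUE":
            return stack.pop(), add_inputs
        else:
            raise ValueError(f"unsupported opcode: {opname}")
    raise ValueError("program ended without RETURN_VALUE")


result, add_inputs = run(PROGRAM, {"a": 2, "b": 3, "c": 4})
print(result, add_inputs)  # 10 [(6, 4)]
```

The intermediate 6 lives only on the operand stack, which is why it never shows up in locals().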

    Inspecting the stack

    Down to C level - C extension or GDB

    As we saw above, CPython's frame objects are native: PyFrameObject is a C struct, and frameobject.c defines the Python-level interface that lets you read (and sometimes write) some of its members.

    The f_stacktop member is not exposed to Python, so to read the stack you'll have to write some C code or use GDB.

    If you're writing a debugging-utils library, I'd recommend a C extension: implement a few basic primitives in C (like getting the current stack as a list of Python objects) and keep the rest of the logic in Python.

    If it's a one-time thing, you could try playing around with GDB and inspecting the stack there.

    When you don't have a compiler - using pure python

    The plan: find the address of the stack and read the numbers stored in it from memory - in python!

    First, we need to find the offset of f_stacktop within the frame object. I installed a debugging build of Python (on my Ubuntu it's apt install python3.7-dbg). This package includes a python binary with debugging symbols (extra information about the program that helps debuggers).

    dwarfdump is a utility that can read and display debugging symbols (DWARF is a common debugging-info format used mostly in ELF binaries). Running dwarfdump -S any=f_stacktop -Wc /usr/bin/python3.7-dbg provides us with the following output:

    DW_TAG_member
        DW_AT_name                  f_stacktop
        DW_AT_decl_file             0x00000034 ./build-debug/../Include/frameobject.h
        DW_AT_decl_line             0x0000001c
        DW_AT_decl_column           0x00000010
        DW_AT_type                  <0x00001969>
        DW_AT_data_member_location  88
    

    DW_AT_data_member_location sounds like the offset of f_stacktop!
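If you need to repeat this lookup for several binaries, the offset can be pulled out of the dwarfdump text automatically. A minimal sketch, assuming the output has the same layout as the listing above (other dwarfdump versions may print the location differently). member_offset is a hypothetical helper, and SAMPLE is simply that listing copied into a string.

```python
# SAMPLE is the dwarfdump listing from the text, used as test input.
SAMPLE = """\
DW_TAG_member
    DW_AT_name                  f_stacktop
    DW_AT_decl_file             0x00000034 ./build-debug/../Include/frameobject.h
    DW_AT_decl_line             0x0000001c
    DW_AT_decl_column           0x00000010
    DW_AT_type                  <0x00001969>
    DW_AT_data_member_location  88
"""


def member_offset(dwarfdump_text, member):
    """Return the byte offset listed for `member`, or None if it is absent.

    Hypothetical helper: assumes one attribute per line, as in SAMPLE.
    """
    in_member = False
    for line in dwarfdump_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "DW_AT_name":
            # Track whether the current entry describes the wanted member.
            in_member = fields[1:2] == [member]
        elif in_member and fields[0] == "DW_AT_data_member_location":
            if len(fields) >= 2:
                # Base 0 accepts "88" as well as "0x58".
                return int(fields[1], 0)
    return None


F_STACKTOP = member_offset(SAMPLE, "f_stacktop")
print(F_STACKTOP)  # 88
```

In practice the text would come from running the dwarfdump command above against your own binary, for example through subprocess.run with capture_output=True.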

    Now let's write some code:

    #!/usr/bin/python3.7-dbg
    from ctypes import POINTER, py_object
    # opname is a list of opcode names, where the indexes are the opcode numbers
    from opcode import opname
    import sys 
    
    # The offset we found using dwarfdump
    F_STACKTOP = 88
    
    def get_stack(frame):
        # Getting the address of the stack by adding
        # the address of the frame and the offset of the member
        f_stacktop_addr = id(frame) + F_STACKTOP
        # Initializing a PyObject** directly from memory using ctypes
        return POINTER(py_object).from_address(f_stacktop_addr)
    
    def tracefunc(frame, event, arg):
        frame.f_trace_opcodes = True
        if event == 'opcode':
            # frame.f_code.co_code is the raw bytecode
            opcode = frame.f_code.co_code[frame.f_lasti]
            if opname[opcode] == 'BINARY_ADD':
                stack = get_stack(frame)
                # According to the implementation of BINARY_ADD,
                # the last two items in the stack should be the addition operands
                print(f'{stack[-2]} + {stack[-1]}')
        return tracefunc
    
    def f(a, b, c): 
        return a * b + c 
    
    sys.settrace(tracefunc)
    f(2, 3, 4)
    

    The output: 6 + 4! Great success! (said with satisfied Borat voice)
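The negative indices work because the stack pointer addresses the slot just past the topmost item, so stack[-1] is the top of the stack and stack[-2] is the item below it. This can be tried safely without touching a real frame. In the sketch below, fake_stack and cell are stand-ins invented for this illustration: a ctypes array plays the role of the value stack, and a pointer-sized cell plays the role of the f_stacktop member.

```python
from ctypes import POINTER, addressof, c_void_p, py_object, sizeof

# A fake value stack holding the two operands of the addition.
fake_stack = (py_object * 2)(6, 4)

# Address of the slot just past the last item, which is where the
# stack pointer would point right before BINARY_ADD runs.
top_addr = addressof(fake_stack) + 2 * sizeof(py_object)

# A pointer-sized cell holding that address, standing in for the
# f_stacktop member inside the frame struct.
cell = c_void_p(top_addr)

# Same trick as get_stack(): reinterpret the cell as a PyObject**.
stack = POINTER(py_object).from_address(addressof(cell))

left, right = stack[-2], stack[-1]
print(f"{left} + {right}")
```

Because ctypes performs no bounds checking on pointer indexing, reading past the real stack contents in either direction can crash the interpreter, so only index slots that the opcode is known to use.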

    This code is not portable yet, because the offset F_STACKTOP varies between Python binaries (the 88 above comes from the debug build). To fix that, you could use ctypes.Structure to describe the frame object's layout and read the f_stacktop member through it in a more portable fashion.

    Note that doing so will hopefully make your code platform-independent, but it will not make it Python-implementation-independent. Code like that might only work with the CPython version you originally wrote it for, because a ctypes.Structure subclass has to mirror CPython's implementation of frame objects (more specifically, the types and order of PyFrameObject's members).
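A minimal sketch of that ctypes.Structure approach follows. The field list mirrors the head of PyFrameObject as far as I can tell from CPython 3.7's frameobject.h, with the members elided as "More fields" in the struct above written out from that header. Treat the layout as an assumption and check it against the headers of your own interpreter, since other CPython versions lay frames out differently. The names frame_head_type and stacktop_offset are made up for this example. The trace_refs flag accounts for builds compiled with Py_TRACE_REFS, which put two extra pointers in front of every object; detecting such builds through sys.getobjects is likewise an assumption.

```python
import sys
from ctypes import POINTER, Structure, c_ssize_t, c_void_p, py_object


def frame_head_type(trace_refs):
    """Build a Structure mirroring the start of CPython 3.7's PyFrameObject.

    Assumed layout; verify it against your interpreter's frameobject.h.
    """
    fields = []
    if trace_refs:
        # Py_TRACE_REFS builds start every object with two list pointers.
        fields += [("_ob_next", c_void_p), ("_ob_prev", c_void_p)]
    fields += [
        # PyObject_VAR_HEAD
        ("ob_refcnt", c_ssize_t),
        ("ob_type", c_void_p),
        ("ob_size", c_ssize_t),
        # Frame members up to and including f_stacktop
        ("f_back", c_void_p),
        ("f_code", c_void_p),
        ("f_builtins", c_void_p),
        ("f_globals", c_void_p),
        ("f_locals", c_void_p),
        ("f_valuestack", c_void_p),
        ("f_stacktop", POINTER(py_object)),
    ]

    class FrameHead(Structure):
        _fields_ = fields

    return FrameHead


def stacktop_offset(trace_refs=None):
    """Offset of f_stacktop in bytes under the assumed 3.7 layout."""
    if trace_refs is None:
        # Assumption: sys.getobjects only exists on Py_TRACE_REFS builds.
        trace_refs = hasattr(sys, "getobjects")
    return frame_head_type(trace_refs).f_stacktop.offset


# Offsets for a regular build and for a Py_TRACE_REFS build.
print(stacktop_offset(trace_refs=False), stacktop_offset(trace_refs=True))
```

On a 64-bit platform the second number is 88, which matches the dwarfdump output for the debug build. With a matching layout, frame_head_type(...).from_address(id(frame)).f_stacktop would take the place of the hard-coded offset in get_stack().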