Tags: python, hook, cpython, trace

Is there any way to execute code on function creation in CPython?


Is there any way to hook the CPython interpreter so that every function creation (def, lambda) results in a call to a procedure that I've defined? sys.settrace and sys.setprofile unfortunately don't seem to cover both def and lambda.

Update:

It seems Python 3.7 has f_trace_opcodes... is there any option for earlier versions?


Solution

  • There is no equivalent to opcode tracing in versions before 3.7. If there were, the feature wouldn't have been added to 3.7 in the first place.

    If you can upgrade to 3.7, then what you want is easy:

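    import dis
    import sys

    def tracefunc(frame, event, arg):
        if event == 'call':
            # Ask for per-opcode events in every newly entered frame.
            frame.f_trace_opcodes = True
        elif event == 'opcode':
            # f_lasti is the offset of the opcode about to be executed.
            if frame.f_code.co_code[frame.f_lasti] == dis.opmap['MAKE_FUNCTION']:
                makefunctiontracefunc(frame)  # your hook; not defined here
        return tracefunc
    sys.settrace(tracefunc)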
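
For a self-contained run of the tracer above, the sketch below supplies a stand-in makefunctiontracefunc (the original leaves it undefined) that just records where each function is created. The names created and demo are made up for illustration, and it assumes CPython 3.7 or later.

```python
import dis
import sys

created = []  # (enclosing code name, line number) per MAKE_FUNCTION seen

def makefunctiontracefunc(frame):
    # Hypothetical hook: note where a function object is about to be built.
    created.append((frame.f_code.co_name, frame.f_lineno))

def tracefunc(frame, event, arg):
    if event == 'call':
        # Request 'opcode' events for this frame (CPython 3.7+ only).
        frame.f_trace_opcodes = True
    elif event == 'opcode':
        if frame.f_code.co_code[frame.f_lasti] == dis.opmap['MAKE_FUNCTION']:
            makefunctiontracefunc(frame)
    return tracefunc

def demo():
    # One def and one lambda, so the hook should fire twice in this frame.
    def inner():
        pass
    square = lambda x: x * x
    return inner, square

sys.settrace(tracefunc)
try:
    demo()
finally:
    sys.settrace(None)  # always switch tracing off again
```

After the run, created should hold two entries, one for the def and one for the lambda inside demo. Code that was already running when sys.settrace was called (here, the module body) gets no 'call' event, so functions defined there are not seen.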
    

    But if you can't… there are a number of more complicated things you could do, depending on what your reasons are for wanting this, but none of them are remotely easy:

    • Use line tracing, and inspect the code until the next line. This is trivial for def, but for lambda (and comprehensions [1]) it's going to be a big pain, because a lambda (or even five of them) can appear in the middle of a statement. You could ast.parse the source, or examine the bytecode, to figure out that there are functions being defined within, but there's still no way to call your hook right at the time of definition.
    • Instead of using tracing, write an import hook that modifies the code as it's being imported. The easy way to do this is probably at the AST level: after you parse the source, use a NodeTransformer to inject calls to some function [2] before or after each def and lambda node, then compile the transformed tree. But you could also do it at the bytecode level with bytecode or byteplay, before or after each MAKE_FUNCTION [3].
    • Script pdb instead of writing your own debugger. I'm not sure if this will even help, because pdb has no way to step through part of an expression in the first place.
    • Debug CPython itself, and add a breakpoint in the MAKE_FUNCTION handler in the ceval loop that calls your code. Of course your code is in the debugger's interpreter—which can be Python for gdb and lldb, but it's still not the same Python interpreter you're debugging. And, while it's possible to recursively evaluate code into the debugged interpreter (or trigger its pdb), it's not easy, and you segfault all over the place while working it out.
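
The AST route in the second bullet could look roughly like this. It is only a sketch: the hook name __function_created__, the class InjectCreationHook, and the helper compile_with_hook are hypothetical. The hook is installed through builtins, as footnote 2 suggests, and the import hook that would call compile_with_hook for each module is left out. Instead of injecting a separate statement around each def, it appends an innermost decorator, which is called right after the function object is built. Each lambda is wrapped in a call to the same hook.

```python
import ast
import builtins

HOOK_NAME = "__function_created__"  # hypothetical name, not a CPython API

class InjectCreationHook(ast.NodeTransformer):
    def _hook(self):
        return ast.Name(id=HOOK_NAME, ctx=ast.Load())

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # rewrite nested defs and lambdas first
        # Last decorator in the list is applied first, so the hook
        # receives the raw function before any user decorators run.
        node.decorator_list.append(self._hook())
        return node

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Lambda(self, node):
        self.generic_visit(node)
        # Replace `lambda ...` with `HOOK(lambda ...)`.
        return ast.Call(func=self._hook(), args=[node], keywords=[])

def compile_with_hook(source, filename="<hooked>"):
    tree = InjectCreationHook().visit(ast.parse(source, filename))
    ast.fix_missing_locations(tree)  # new nodes need line numbers
    return compile(tree, filename, "exec")

seen = []

def on_function_created(func):
    seen.append(func.__name__)
    return func  # hand the function back so the program behaves the same

# Make the hook visible from every module without injecting an import.
setattr(builtins, HOOK_NAME, on_function_created)

source = """
def outer(x):
    return list(map(lambda y: y + x, [1, 2]))

result = outer(10)
"""
namespace = {}
exec(compile_with_hook(source), namespace)
```

Here seen records outer when the def executes and <lambda> when outer is called. Unlike the opcode-based approach, this does not fire for comprehensions unless the transformer handles those nodes explicitly.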

    1. Comprehensions (except list comprehensions, in 2.x) are implemented by defining and then calling a function. So, any of the methods that rely on the MAKE_FUNCTION opcode or similar are going to also fire on comprehensions, while those that rely on source or AST parsing will not (unless you do so explicitly, of course). Since Python 3.12 (PEP 709), list, dict, and set comprehensions are inlined and no longer create a function, though generator expressions still do.

    2. Obviously you also need to inject an import at the top of every module to make that function available, or inject the function into the builtins module.

    3. And MAKE_CLOSURE, for versions of Python before 3.6.
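
Footnote 1 can be checked with dis. The sketch below counts function-creating opcodes in the top-level code of a snippet. The helper name count_make_function is made up for illustration. It uses a generator expression because, as far as I know, those are compiled to a nested function on every CPython 3 release, whereas other comprehension kinds depend on the version.

```python
import dis

def count_make_function(source):
    # Compile the snippet and scan only its top-level bytecode.
    code = compile(source, "<demo>", "exec")
    return sum(
        1
        for ins in dis.get_instructions(code)
        # MAKE_CLOSURE only exists on old interpreters (see footnote 3).
        if ins.opname in ("MAKE_FUNCTION", "MAKE_CLOSURE")
    )

genexp_count = count_make_function("total = sum(x * x for x in range(3))")
plain_count = count_make_function("total = 1 + 2")
```

A nonzero genexp_count means an opcode-based hook would fire on that line even though the source contains neither def nor lambda.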