Tags: python, python-3.x, unit-testing, execution

Python Code: Information on Execution Trace of loops/conditionals


I want to get the execution trace of a Python function, in terms of the loops and conditionals it executed, once the function has completed. However, I want to do this without instrumenting the original function with additional parameters. For example:

def foo(a: int, b: int):
    while a:
        a = do_something()
        if b:
            a = do_something()


if __name__ == "__main__":
    foo(a, b)

After foo() has run, I want an execution trace something like [while: true, if: false, while: true, if: true, while: false, ...], which documents the sequence of conditional evaluations in the code. Is there any way to get this information automatically for an arbitrary Python function?

I understand that the Python coverage module reports branch coverage information, but I am unsure how to use it in this context.


Solution

  • You can use trace_conditions.py as a starting point and modify it as needed.

    Example

    The foo function defined in the question is used in the example below:

    from trace_conditions import trace_conditions
    
    # (1) This will just print conditions
    traced_foo = trace_conditions(foo)
    traced_foo(a, b)
    # while a -> True
    # if b -> True
    # ...
    
    # (2) This will return conditions
    traced_foo = trace_conditions(foo, return_conditions=True)
    result, conditions = traced_foo(a, b)
    # conditions = [('while', 'a', True), ('if', 'b', True), ...]
    

    Note: ast.unparse is used to get the string representation of a condition. It was introduced in Python 3.9. On an older version of Python, you can install the third-party package astunparse and use it in the function _condition_to_string; otherwise trace_conditions will not return string representations of the conditions.
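
    A fallback along these lines might look like the sketch below. It is not the code from trace_conditions.py: the name condition_to_string is illustrative, and the astunparse branch assumes that package is installed on the older interpreter.

```python
import ast


def condition_to_string(test_node):
    """Return the source text of a condition node, or None if no unparser exists."""
    if hasattr(ast, "unparse"):  # available from Python 3.9
        return ast.unparse(test_node)
    try:
        import astunparse  # third-party package, assumed installed on older Pythons
    except ImportError:
        return None  # no string representation available
    return astunparse.unparse(test_node).strip()


# The `test` attribute of a While/If node holds the condition expression.
tree = ast.parse("while x > 5:\n    x -= 1")
condition_text = condition_to_string(tree.body[0].test)
print(condition_text)
```

    Returning None when neither unparser is available mirrors the behaviour described in the note: the trace still works, only without condition text.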

    TL;DR

    Idea

    Basically, we want to programmatically add catchers to the function's code. For example, print catchers could look like this:

    while x > 5:
        print('while x > 5', x > 5)  # <-- print condition after while
        ...  # do something
    
    print('if x > 5', x > 5)  # <-- print condition before if
    if x > 5:
        ...  # do something
    

    So, the main idea is to use Python's code introspection tools (inspect, ast, exec).
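
    To make the idea concrete, here is a minimal end-to-end sketch. It is not the implementation in trace_conditions.py: it parses a source string instead of calling inspect.getsource, uses ast.NodeTransformer rather than a custom walk, and the names SOURCE, countdown and PrintCatcherInjector are invented for illustration. It assumes Python 3.9+ for ast.unparse.

```python
import ast
import contextlib
import copy
import io

SOURCE = '''
def countdown(x):
    while x > 5:
        x -= 1
    if x > 5:
        return "big"
    return "small"
'''


class PrintCatcherInjector(ast.NodeTransformer):
    """Insert a print catcher after each `while` header and before each `if`."""

    @staticmethod
    def _catcher(keyword, test):
        # Builds the statement: print('<keyword> <condition>', <condition>)
        label = f"{keyword} {ast.unparse(test)}"
        call = ast.Call(
            func=ast.Name(id="print", ctx=ast.Load()),
            args=[ast.Constant(value=label), copy.deepcopy(test)],
            keywords=[],
        )
        return ast.Expr(value=call)

    def visit_While(self, node):
        self.generic_visit(node)  # handle nested loops and conditionals first
        node.body.insert(0, self._catcher("while", node.test))
        return node

    def visit_If(self, node):
        self.generic_visit(node)
        # Returning a list replaces the `if` with [catcher, if] in its parent.
        return [self._catcher("if", node.test), node]


tree = PrintCatcherInjector().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, filename="<traced>", mode="exec"), namespace)
traced_countdown = namespace["countdown"]

# Capture what the injected print calls write, so the trace can be inspected.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = traced_countdown(7)
trace_lines = buffer.getvalue().splitlines()
print(result, trace_lines)
```

    As in the hand-written example above, the while catcher sits inside the loop body, so the final evaluation of the loop condition (the one that is False) is not reported.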

    Implementation

    Here I will briefly explain the code in trace_conditions.py:

    Main function trace_conditions

    The main function is self-explanatory and simply reflects the whole algorithm: (1) build syntactic tree; (2) inject condition catchers; (3) compile new function.

    def trace_conditions(
            func: Callable, return_conditions=False):
        catcher_type = 'yield' if return_conditions else 'print'
    
        tree = _build_syntactic_tree(func)
        _inject_catchers(tree, catcher_type)
        func = _compile_function(tree, globals_=inspect.stack()[1][0].f_globals)
    
        if return_conditions:
            func = _gather_conditions(func)
        return func
    

    The only thing that requires explanation is globals_=inspect.stack()[1][0].f_globals. To compile a new function, we need to give Python a global namespace containing every name the function uses, including imported modules (for example math, numpy, django, etc.). inspect.stack()[1][0].f_globals simply takes the globals of the module that called trace_conditions, that is, everything imported or defined there.

    Caveat!

    # math_pi.py
    import math
    
    def get_pi():
        return math.pi
    
    
    # test.py
    from math_pi import get_pi
    from trace_conditions import trace_conditions
    
    traced = trace_conditions(get_pi)
    traced()  # NameError: math is not imported in this module (test.py)
    

    To solve this, you can either modify the code in trace_conditions.py or simply add import math to test.py.
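
    One possible modification, sketched here under assumptions, is to compile the new function against the original function's own globals (the standard func.__globals__ attribute) instead of the caller's. The helper recompile and the simulated module namespaces below are hypothetical stand-ins for math_pi.py and test.py, not part of trace_conditions.py.

```python
import ast

MATH_PI_SOURCE = "import math\n\ndef get_pi():\n    return math.pi\n"

# Simulate math_pi.py: a separate namespace in which `math` is imported.
math_pi_namespace = {}
exec(MATH_PI_SOURCE, math_pi_namespace)
get_pi = math_pi_namespace["get_pi"]


def recompile(func, source, globals_):
    """Recompile the function named like `func` from `source`, bound to `globals_`."""
    tree = ast.parse(source)
    func_def = next(
        node for node in tree.body
        if isinstance(node, ast.FunctionDef) and node.name == func.__name__
    )
    module = ast.Module(body=[func_def], type_ignores=[])
    local_namespace = {}  # receives the new function, so globals_ is not overwritten
    exec(compile(module, "<recompiled>", "exec"), globals_, local_namespace)
    return local_namespace[func.__name__]


# Stand-in for the globals of test.py, where math was never imported.
caller_globals = {}
broken = recompile(get_pi, MATH_PI_SOURCE, caller_globals)
try:
    broken()
    caller_failed = False
except NameError:
    caller_failed = True

# Using the function's own module globals resolves `math` correctly.
fixed = recompile(get_pi, MATH_PI_SOURCE, get_pi.__globals__)
pi_value = fixed()
print(caller_failed, pi_value)
```

    Passing a separate locals dictionary to exec keeps the recompiled function from replacing the original get_pi in its module.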

    _build_syntactic_tree

    Here we first get the source code of the function using inspect.getsource and then parse it into a syntactic tree using ast.parse. Unfortunately, this does not work cleanly when trace_conditions is applied as a decorator (the retrieved source still contains the decorator line), so the convenient decorator syntax does not seem usable with this approach.
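
    A rough sketch of this step follows. The names parse_function_source and build_syntactic_tree are illustrative, and the textwrap.dedent call is an addition of this sketch rather than something confirmed to be in trace_conditions.py: source returned for methods or nested functions is indented, and ast.parse rejects it unless the common indentation is removed first.

```python
import ast
import inspect
import textwrap


def parse_function_source(source):
    """Dedent function source text, parse it, and return the FunctionDef node."""
    tree = ast.parse(textwrap.dedent(source))
    return tree.body[0]


def build_syntactic_tree(func):
    """Same as above, starting from a function object defined in a source file."""
    return parse_function_source(inspect.getsource(func))


# Source text as inspect.getsource would return it for a method: still indented.
METHOD_SOURCE = (
    "    def is_big(self, x):\n"
    "        if x > 5:\n"
    "            return True\n"
    "        return False\n"
)
func_def = parse_function_source(METHOD_SOURCE)
print(type(func_def).__name__, func_def.name)
```

    Note that inspect.getsource needs the function to come from a file it can read, so build_syntactic_tree will not work for functions typed into an interactive session or created with exec.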

    _inject_catchers

    In this function we traverse the given syntactic tree, find the while and if statements, and inject catchers after or before them, respectively. The ast module has the function walk, but it yields only the node itself (without its parent), so I implemented a slightly changed version of walk that yields the parent node as well. We need the parent in order to insert a catcher before an if.
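
    A parent-aware walk could be written roughly as follows. This is a sketch modelled on the breadth-first traversal of ast.walk, not the actual _walk_with_parent from trace_conditions.py.

```python
import ast
from collections import deque


def walk_with_parent(tree):
    """Yield (parent, node) pairs breadth-first; the root is paired with None."""
    queue = deque([(None, tree)])
    while queue:
        parent, node = queue.popleft()
        yield parent, node
        for child in ast.iter_child_nodes(node):
            queue.append((node, child))


tree = ast.parse("def f(x):\n    if x:\n        while x:\n            x -= 1\n")
pairs = [
    (type(parent).__name__, type(node).__name__)
    for parent, node in walk_with_parent(tree)
    if isinstance(node, (ast.If, ast.While))
]
print(pairs)
```

    With the parent in hand, a catcher can be inserted into whichever statement list of the parent (body or orelse) contains the if node.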

    def _inject_catchers(tree, catcher_type):
        for parent, node in _walk_with_parent(tree):
            if isinstance(node, ast.While):
                _catch_after_while(node, _create_catcher(node, catcher_type))
            elif isinstance(node, ast.If):
                _catch_before_if(parent, node, _create_catcher(node, catcher_type))
        ast.fix_missing_locations(tree)
    

    At the end we call ast.fix_missing_locations, which fills in technical fields such as lineno and col_offset that are required to compile the code. You usually need to call it whenever you add nodes to a syntactic tree.
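
    The small, self-contained illustration below shows what the call does to a hand-built node; the statement being added (y = 2) is arbitrary and chosen only for the demonstration.

```python
import ast

tree = ast.parse("x = 1\n")

# A node built by hand carries no position information.
new_stmt = ast.Assign(
    targets=[ast.Name(id="y", ctx=ast.Store())],
    value=ast.Constant(value=2),
)
tree.body.append(new_stmt)
had_lineno_before = hasattr(new_stmt, "lineno")

# Fill in the missing lineno/col_offset fields from the surrounding nodes.
ast.fix_missing_locations(tree)
has_lineno_after = hasattr(new_stmt, "lineno")

namespace = {}
exec(compile(tree, filename="<demo>", mode="exec"), namespace)
print(had_lineno_before, has_lineno_after, namespace["y"])
```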

    Catching elif statement

    Interestingly, Python's ast grammar has no elif statement, only if-else. The ast.If node has a body field that contains the statements of the if block and an orelse field that contains the statements of the else block. An elif is simply represented by an ast.If node inside the orelse field. This fact is reflected in the function _catch_before_if.
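
    This representation is easy to confirm by parsing a small snippet. The sketch below assumes Python 3.9+ for ast.unparse, and the variable names are illustrative.

```python
import ast

SOURCE = (
    "if x > 5:\n"
    "    size = 'big'\n"
    "elif x > 2:\n"
    "    size = 'medium'\n"
    "else:\n"
    "    size = 'small'\n"
)

outer_if = ast.parse(SOURCE).body[0]
elif_node = outer_if.orelse[0]  # the elif is an ast.If nested in orelse

outer_test = ast.unparse(outer_if.test)
elif_test = ast.unparse(elif_node.test)
final_else = elif_node.orelse  # statements of the trailing else block
print(type(elif_node).__name__, outer_test, elif_test, len(final_else))
```

    An else block whose only statement is an if produces the same node structure, so a catcher placed before the nested ast.If inside orelse covers both spellings.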

    Catchers (and _gather_conditions)

    There are several ways to catch conditions. The simplest is to print them, but that does not work if you want to handle them later in Python code. One straightforward way is to have a global empty list to which the condition and its value are appended while the function executes. However, this introduces a new name into the namespace that could clash with local names inside the function, so I decided it would be safer to yield each condition together with its information.

    The function _gather_conditions wraps the function containing the injected yield statements; the wrapper gathers all yielded conditions and returns both the function's result and the conditions.
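
    Such a wrapper could look roughly like the sketch below. It is not the code from trace_conditions.py: gather_conditions and halve_until_small are illustrative names, and the yield statements are written by hand to stand in for injected catchers. It relies on the fact that a generator's return value is carried by StopIteration.value.

```python
import functools


def gather_conditions(generator_func):
    """Wrap a function with injected yields; return (result, conditions)."""
    @functools.wraps(generator_func)
    def wrapper(*args, **kwargs):
        conditions = []
        generator = generator_func(*args, **kwargs)
        while True:
            try:
                conditions.append(next(generator))
            except StopIteration as stop:
                # The original `return` value travels on the exception.
                return stop.value, conditions
    return wrapper


# Hand-written stand-in for a function after yield catchers were injected.
def halve_until_small(x):
    while x > 5:
        yield ("while", "x > 5", True)
        x //= 2
    yield ("if", "x == 2", x == 2)
    if x == 2:
        return "two"
    return "other"


result, conditions = gather_conditions(halve_until_small)(20)
print(result, conditions)
```

    One limitation of this design is that it assumes the traced function is not already a generator; if it were, its own yielded values would be mixed in with the conditions.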