Search code examples
pythonunit-testingpytesttrace

Hunt for root cause of flaky test in Python


There is a flaky test, and we have no clue what the root cause could be.

with pytest.raises(foo.geocoder.GeocodingFailed):
    foo.geocoder.geocode('not a valid place')

Sometimes the exception does not happen.

I looked at the docs how to handle failures, but this did not help.

How to trace geocode() so that if the exception does not happen, I see a trace?

I looked at the trace module of the standard library. But there seems to be no easy way to get the trace as string.

With "a trace" I mean: A trace of all lines which got executed during gecode(). I would like to see method calls and returns statements with indentation. I want to ignore lines from the Python standard library.

AFAIK a debugger like pdb does not help here, since the test only fails when it gets run in CI, and only once or twice per month.


Solution

  • The trace module gives programmatic access via trace.Trace class.
    On test failures the Trace class' console output is visible.
    And it has coverage report to write on chosen path.

    I exclude sys.base_exec_prefix, sys.base_prefix to not trace Python library modules.

    import pytest
    
    
    def f(x):
        import random
    
        choice = random.choice((True, False))
    
        if choice:
            raise ValueError
        else:
            return x * 2
    
    
    def trace(f, *args, **kwargs):
        import trace
        import sys
    
        tracer = trace.Trace(
            ignoredirs=(sys.base_exec_prefix, sys.base_prefix), timing=True,
        )
        ret = tracer.runfunc(f, *args, **kwargs)
        r = tracer.results()
        r.write_results(coverdir="/tmp/xx-trace")
        return ret
    
    
    def test_f():
        with pytest.raises(ValueError):
            trace(f, 3)
    

    pytest run

    Coverage report;
    Lines marked with >>>>>> were not executed i.e. not traced and the numbers with colon are execution counts.

    >>>>>> import pytest
           
           
    >>>>>> def f(x):
        1:     import random
           
        1:     choice = random.choice((True, False))
           
        1:     if choice:
    >>>>>>         raise ValueError
               else:
        1:         return x * 2
           
           
    >>>>>> def trace(f, *args, **kwargs):
    >>>>>>     import trace
    >>>>>>     import sys
           
    >>>>>>     tracer = trace.Trace(
    >>>>>>         ignoredirs=(sys.base_exec_prefix, sys.base_prefix), timing=True,
               )
    >>>>>>     tracer.runfunc(f, *args, **kwargs)
    >>>>>>     r = tracer.results()
    >>>>>>     r.write_results(coverdir="/tmp/xx-trace")
           
           
    >>>>>> def test_f():
    >>>>>>     with pytest.raises(ValueError):
    >>>>>>         trace(f, 3)