In a large project, I encountered a function which is quite slow (speaking of seconds to minutes execution time). The function does a lot of things and has a very deep stacktrace. While only a few classes are involved in the execution of this function, it is not too obvious where the long runtime comes from.
I started to debug the function, tracing calls etc and found the trace package quite useful. With this I could identify some functions that would assemble lists over and over again, which actually lead to a speedup of about 3x when saving the lists after the first execution.
But now I can not really see any more obvious parts, where the function can be optimized as the trace package produces several megabytes of text and I can not spot anything that looks suspicious to me.
I thought about use the timing option of the trace, to give me some overview about the runtime, to see which functions might be slow - but the amount of data is just too large, so a summery would be nice where the total execution time of each call is listed, but it seems this is not supported by the trace package?
Yet another question is, on which level I'd like to get the execution times. It is rather not the case that a single statement is slow, but a whole function is called quite often or the data is not saved... So what I eventually would need is the average execution time per statement multiplied by the count. The later one can be generated by the trace package.
Is there eventually any other tool besides pdb and trace I could use?
Have you tried profiling the code? Here is an example that uses cProfile to gather summary statistics on how long different functions executed:
import cProfile, pstats, StringIO
import time
# simulate a delay
def delay(ms):
startms = int(round(time.time() * 1000))
while (int(round(time.time() * 1000)) - startms <= ms):
pass
def foo1():
delay(100)
def foo2():
for x in range(10):
foo1()
def foo3():
for x in range(20):
foo1()
def foo4():
foo2()
foo3()
if __name__ == '__main__':
pr = cProfile.Profile()
pr.enable() # start profiling
foo4()
pr.disable() # end profiling
s = StringIO.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print s.getvalue()
And here is the output:
4680454 function calls in 3.029 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 3.029 3.029 C:\Temp\test.py:21(foo4)
30 0.000 0.000 3.029 0.101 C:\Temp\test.py:10(foo1)
30 2.458 0.082 3.029 0.101 C:\Temp\test.py:5(delay)
1 0.000 0.000 2.020 2.020 C:\Temp\test.py:17(foo3)
1 0.000 0.000 1.010 1.010 C:\Temp\test.py:13(foo2)
2340194 0.308 0.000 0.308 0.000 {round}
2340194 0.263 0.000 0.263 0.000 {time.time}
2 0.000 0.000 0.000 0.000 {range}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}