I have an HTTP server written in Python that accepts a large binary file (>50 MB) and performs some file-related computation (decryption, decompression, ...) on it. I want a good estimate of how long these operations take. The server runs on a multi-CPU, multi-core machine under Ubuntu 11.10.
Currently I'm just taking a time diff of date.now() values to get the execution times for various operations. I know there are a couple of Python modules that provide profiling capabilities. However, my understanding is that they are limited to small code snippets only.
What are my other options?
Thanks.
I'd say that cProfile is pretty solid, and definitely an improvement on using something like date.now(). Here is what cProfile produces for a comparison of a useless and a slightly less useless Fibonacci generator.
$ python -m cProfile script.py
         4113777 function calls (1371305 primitive calls) in 1.337 seconds

   Ordered by: standard name

     ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
          1    0.000    0.000    0.000    0.000  cProfile.py:5(<module>)
          1    0.000    0.000    0.000    0.000  cProfile.py:66(Profile)
          1    0.009    0.009    1.337    1.337  script.py:1(<module>)
 2692508/30    1.069    0.000    1.268    0.042  script.py:3(fib)
74994/25000    0.058    0.000    0.058    0.000  script.py:9(fibber)
    1346269    0.200    0.000    0.200    0.000  {max}
          1    0.000    0.000    0.000    0.000  {method 'disable' of '_lsprof.Profiler' objects}
          2    0.001    0.000    0.001    0.000  {range}
Here is the code I used:
import cProfile

def fib(num):
    if num < 3:
        return max(num, 1)
    return fib(num - 1) + fib(num - 2)

fibdict = {0: 1, 1: 1, 2: 2}
def fibber(num):
    if num not in fibdict:
        n = fibber(num - 1) + fibber(num - 2)
        fibdict[num] = n
    return fibdict[num]

a = [fib(i) for i in range(30)]
b = [fibber(i) for i in range(25000)]
cProfile pretty clearly tells me how slowly fib is running, and gives me nice data on time per call and number of calls, as well as how much total time was spent in each function. Obviously this is trivial/toy code, but I regularly use cProfile to get a feel for where my code is spending the most time, and I have found it very effective for non-trivial code. I would imagine it would give you the data that you need.
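Since a long-running server usually can't just be started with python -m cProfile, here is a rough sketch of profiling a single call from inside the process instead, using cProfile.Profile together with pstats. It assumes Python 3 (for io.StringIO as the report stream). The names process_upload and profile_call are made up for illustration; process_upload is only a stand-in for whatever decryption/decompression work your handler really does.

```python
import cProfile
import io
import pstats
import zlib


def process_upload(payload):
    """Hypothetical stand-in for the per-file work (compress, then decompress)."""
    compressed = zlib.compress(payload)
    return zlib.decompress(compressed)


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report).

    The report lists the top 10 entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        # Always stop collecting, even if func raises.
        profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    data = b"example payload " * 100000
    result, report = profile_call(process_upload, data)
    print(report)
```

Sorting by cumulative time tends to put the expensive top-level operations first, which is probably what you want when the question is "which stage of handling the file is slow". Keep in mind that the profiler adds overhead of its own, so treat the absolute numbers as approximate.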