Tags: python, http, profiling, multicore, execution

Options to profile server-side execution times in Python code


I have an HTTP server written in Python that accepts a large binary file (>50 MB) and performs some file-related computation (decryption, decompression, ...) on it. I want to get a good estimate of the amount of time it takes to execute these operations. My Python server is running on a multi-CPU, multi-core machine running Ubuntu 11.10.

Currently I'm just taking the difference of datetime.now() timestamps to get the execution times of the various operations. I know there are a couple of Python modules that provide profiling capabilities; however, my understanding is that they are limited to small code snippets only.
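
For reference, this is roughly the kind of timing I'm doing now (a minimal sketch; handle_upload and the zlib-based decompress are just stand-ins for my real handler and file operations):

    import zlib
    from datetime import datetime

    def decompress(data):
        # stand-in for the real decryption/decompression work
        return zlib.decompress(data)

    def handle_upload(data):
        start = datetime.now()
        result = decompress(data)
        print("decompress took", datetime.now() - start)
        return result

    if __name__ == "__main__":
        # roughly 50 MB of dummy data, compressed and then decompressed again
        handle_upload(zlib.compress(b"x" * (50 * 1024 * 1024)))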

What are my other options?

Thanks.


Solution

  • I'd say that cProfile is pretty solid, and definitely an improvement on timing with something like datetime.now(). Here is what cProfile produces for a comparison of a useless and a slightly less useless Fibonacci generator.

    $ python -m cProfile script.py
             4113777 function calls (1371305 primitive calls) in 1.337 seconds
    
       Ordered by: standard name
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.000    0.000 cProfile.py:5(<module>)
            1    0.000    0.000    0.000    0.000 cProfile.py:66(Profile)
            1    0.009    0.009    1.337    1.337 script.py:1(<module>)
    2692508/30    1.069    0.000    1.268    0.042 script.py:3(fib)
    74994/25000    0.058    0.000    0.058    0.000 script.py:9(fibber)
      1346269    0.200    0.000    0.200    0.000 {max}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
            2    0.001    0.000    0.001    0.000 {range}
    

    Here is the code I used:

    import cProfile
    
    # Naive exponential-time recursion (the "useless" generator).
    def fib(num):
        if num < 3:
            return max(num, 1)
        return fib(num - 1) + fib(num - 2)
    
    fibdict = {0: 1, 1: 1, 2: 2}
    # Memoised version (the "slightly less useless" generator).
    def fibber(num):
        if num not in fibdict:
            n = fibber(num - 1) + fibber(num - 2)
            fibdict[num] = n
        return fibdict[num]
    
    a = [fib(i) for i in range(30)]
    b = [fibber(i) for i in range(25000)]
    

    cProfile pretty clearly tells me how slow fib is running, and gives me nice data on time per call / number of calls, as well as how much total time was spent in the method. Obviously this is trivial/toy code, but I regularly use cProfile to get a feel for where my code is spending the most time, and I have found it very effective for non-trivial code. I would imagine it would give you the data that you need.
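
    Since your code runs inside a long-lived server process rather than a standalone script, you don't have to go through python -m cProfile: you can wrap just the interesting section in a Profile object and dump sorted statistics afterwards, e.g. once per request. Here is a minimal sketch, where process_upload and handle_request are hypothetical names standing in for your decrypt/decompress pipeline and your request handler:

    import cProfile
    import pstats
    import zlib

    def process_upload(data):
        # stand-in for the real decryption/decompression work
        return zlib.decompress(data)

    def handle_request(data):
        profiler = cProfile.Profile()
        profiler.enable()                 # collect stats for this request only
        result = process_upload(data)
        profiler.disable()

        # print the 10 most expensive calls, sorted by cumulative time
        pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
        return result

    if __name__ == "__main__":
        handle_request(zlib.compress(b"x" * (50 * 1024 * 1024)))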