python python-3.x function python-internals

Function call execution speed is faster than non-function call

A function call always incurs some overhead. So why does the code below show that the non-function version is slower?

Code:

import time

def s():
    for i in range(1000000000):
        1 + 1

t = time.time()
s()
print("Function call: " + str(time.time() - t))

t = time.time()
for i in range(1000000000):
    1 + 1
print("Non function call: " + str(time.time() - t))

Output:

Function call: 38.39736223220825
Non function call: 60.33238506317139

Solution

You might be thinking that since the loop only does 1 + 1, there shouldn't be much difference. But there's a 'hidden' assignment here that's commonly forgotten: the one to the loop variable i in your for loop. This is the cause of the slowdown.

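In functions, this is done with STORE_FAST. At the top level, it's done with STORE_NAME. STORE_FAST writes into an indexed slot of the frame's local-variable array, while STORE_NAME has to store into a namespace dictionary, so the former is faster. In a loop that runs 1000000000 times, this difference shows up quite clearly.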

Remember that the function invocation only happens once. So its overhead doesn't really contribute in this specific scenario.

Besides that, all the other steps are pretty much the same in both versions: the range is created and its iterator grabbed once, while on every iteration the constant 2 is loaded (the compiler has already folded 1 + 1 into 2) and immediately popped.

You can always use the dis module to examine the CPython bytecode that is produced for each of these, as @Moses indicated in a comment. For the function s, you have:

import dis

dis.dis(s)
#       snipped for brevity
        >>   10 FOR_ITER                 8 (to 20)
             12 STORE_FAST               0 (i)

  3          14 LOAD_CONST               3 (2)
             16 POP_TOP
             18 JUMP_ABSOLUTE           10

While for the top-level version of the loop:

dis.dis('for i in range(1000000000): 1+1')
#       snipped for brevity
        >>   10 FOR_ITER                 8 (to 20)
             12 STORE_NAME               1 (i)
             14 LOAD_CONST               3 (2)
             16 POP_TOP
             18 JUMP_ABSOLUTE           10

The main difference between these is in the storing of the iteration value i. In functions, it's simply more efficient.
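The same comparison can be made programmatically instead of by reading the listings. The following is only a sketch: the names LOOP_SOURCE, loop_in_function and store_ops are made up for illustration, the loop is shortened to 10 iterations, and exact opcode names vary between CPython versions (newer releases may fuse STORE_FAST into a combined instruction), so the check only looks at name prefixes.

```python
import dis

# Hypothetical names for illustration; a short loop is enough to inspect bytecode.
LOOP_SOURCE = "for i in range(10): 1 + 1"

# Compile the loop in "exec" mode, the way the top level of a script is compiled.
module_code = compile(LOOP_SOURCE, "<module-level loop>", "exec")


def loop_in_function():
    # Same loop, but compiled as part of a function body.
    for i in range(10):
        1 + 1


def store_ops(code_like):
    """Return the names of all STORE_* instructions in a function or code object."""
    return [
        ins.opname
        for ins in dis.get_instructions(code_like)
        if ins.opname.startswith("STORE_")
    ]


module_stores = store_ops(module_code)
function_stores = store_ops(loop_in_function)

print("module level:", module_stores)
print("in function: ", function_stores)

# Where each code object records the name `i`:
# co_names holds names resolved through namespace dictionaries,
# co_varnames holds the function's indexed local-variable slots.
print("module co_names:     ", module_code.co_names)
print("function co_varnames:", loop_in_function.__code__.co_varnames)
```

On the CPython versions I'm aware of, the module-level list contains STORE_NAME and the function list contains a STORE_FAST variant, matching the two listings above.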

To address @Reblochon Masque's (now deleted) answer, which seemed to show no discrepancy between the two when timed with timeit in IPython cells:

timeit times things by creating a little function (named inner) that wraps the statements you pass and runs them a given number of times. You can see this if you create a Timer object and peek at its src attribute (this isn't documented, so don't expect it to always be there :-):

from timeit import Timer

t = Timer('for i in range(10000): 1 + 1')
print(t.src)

This contains the little function that is essentially timed. The previous print call prints:

def inner(_it, _timer):
    pass
    _t0 = _timer()
    for _i in _it:
        for i in range(10000): 1 + 1
    _t1 = _timer()
    return _t1 - _t0

So, in effect, by using timeit you have altered the way the store to i is performed: since the loop now sits inside a function, it is also done with STORE_FAST. Easy pitfall!

(And if you don't believe me, see dis.dis(compile(t.src, '', 'exec').co_consts[0]).)
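If you still want timeit to measure the top-level behaviour, one possible workaround (a sketch, not something from the answer above) is to compile the loop as module-level code yourself and have timeit run it through exec(), so the loop body keeps using STORE_NAME even though timeit wraps the exec() call in its own function. The names N, module_code and function_loop are hypothetical, and N is far smaller than the 1000000000 in the question so the example finishes quickly. No particular timings are claimed, since they depend on the machine and the CPython version.

```python
import timeit

N = 100_000  # assumption: much smaller than the question's loop, for a quick run

# Compiled in "exec" mode, so the loop variable is stored with STORE_NAME.
module_code = compile(f"for i in range({N}): 1 + 1", "<module-level loop>", "exec")


def function_loop(n=N):
    # Inside a function the loop variable is a fast local.
    for i in range(n):
        1 + 1


# exec() runs the precompiled module-level code in a fresh namespace dict each
# time; timeit's wrapper function only contains the exec() call itself.
module_time = min(
    timeit.repeat(
        "exec(module_code, {})",
        globals={"module_code": module_code},
        number=20,
        repeat=5,
    )
)

# timeit also accepts a callable directly.
function_time = min(timeit.repeat(function_loop, number=20, repeat=5))

print(f"module-level loop via exec: {module_time:.4f} s")
print(f"loop inside a function:     {function_time:.4f} s")
```

Taking the minimum of several repeats follows the usual timeit advice, since the lower values are the ones least disturbed by other activity on the machine.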