I just realized something while playing with timeit
and and, or, any(), all()
that I figured I could share here. Here is the script to measure performance:
def recursion(n):
"""A slow way to return a True or a False boolean."""
return True if n == 0 else recursion(n-1)
def my_function():
"""The function where you perform all(), any(), or, and."""
a = False and recursion(10)
if __name__ == "__main__":
import timeit
setup = "from __main__ import my_function"
print(timeit.timeit("my_function()", setup=setup))
And here are some timings:
a = False and recursion(10)
0.08799480279344607
a = True or recursion(10)
0.08964192798430304
As expected, True or recursion(10)
as well as False and recursion(10)
are very fast to compute because only the first term matters and the operation returns immediately.
a = recursion(10) or True # recursion() is False
1.4154556830951606
a = recursion(10) and False # recursion() is True
1.364157978046478
Having or True
or and False
in the line does not speed up the computation here because they are evaluated second and the whole recursion has to be performed first. While annoying, it's logical and it follows operation priority rules.
What is more surprising is that all()
and any()
always have the worst performance regardless of the case:
a = all(i for i in (recursion(10), False))) # recursion() is False
1.8326778537880273
a = all(i for i in (False, recursion(10))) # recursion() is False
1.814645767348111
I would have expected the second evaluation to be much faster than the first one.
a = any(i for i in (recursion(10), True))) # recursion() is True
1.7959248761901563
a = any(i for i in (True, recursion(10))) # recursion() is True
1.7930442127481
Same unmet expectations here.
So it seems like any()
and all()
are far from being a handy way to write respectively a big or
and a big and
if performance matters in your application. Why is that?
Edit: based on the comments it seems the tuple generation is slow. I see no reason why Python itself could not use this:
def all_faster(*args):
Result = True
for arg in args:
if not Result:
return False
Result = Result and arg
return True
def any_faster(*args):
Result = False
for arg in args:
if Result:
return True
Result = Result or arg
return False
It's faster already than the built-in functions and seems to have the short-circuit mechanism.
a = faster_any(False, False, False, False, True)
0.39678611016915966
a = faster_any(True, False, False, False, False)
0.29465180389252055
a = faster_any(recursion(10), False) # recursion() is True
1.5922580174283212
a = faster_any(False, recursion(10)) # recursion() is True
1.5799157924820975
a = faster_all(False, recursion(10)) # recursion() is True
1.6116566893888375
a = faster_all(recursion(10), False) # recursion() is True
1.6004807187900951
Edit2: alright it's faster with arguments passed one by one but slower with generators.
any
and all
short-circuit all right.
The problem is that here in both cases, you have to build the tuple
prior to pass it to any
so the order doesn't make a difference: the time taken is still the same. Let's decompose this with a variable:
t = (True, recursion(10)) # recursion is called
a = any(i for i in t) # this is very fast, only boolean testing
When you're reaching the second line, the time has already been spent.
It's different with and
or or
which short circuit.
The case where any
or all
are interesting is when you're evaluating the data when you're testing:
any(recusion(x) for x in (10,20,30))
If you wanted to avoid evaluation, you could pass a tuple of lambdas (inline functions) to any
and call the functions:
now:
a = any(i() for i in (lambda:recursion(10), lambda:True)))
and:
a = any(i() for i in (lambda:True,lambda:recursion(10))))
have a very different execution time (the latter is instantaneous)