Search code examples
pythonsetset-comprehension

python set comprehension for 2.6


I was trying set comprehension for 2.6, and came across the following two ways. I thought the first method would be faster than the second, timeit suggested otherwise. Why is the second method faster even though the second method has got an extra list instantiation followed by a set instantiation?

Method 1:

In [16]: %timeit set(node[0] for node in pwnodes if node[1].get('pm'))
1000000 loops, best of 3: 568 ns per loop

Method 2:

In [17]: %timeit set([node[0] for node in pwnodes if node[1].get('pm')]) 
1000000 loops, best of 3: 469 ns per loop

where pwnodes = [('e1', dict(pm=1, wired=1)), ('e2', dict(pm=1, wired=1))].


Solution

  • Iteration is simply faster when using a list comprehension:

    In [23]: from collections import deque
    
    In [24]: %timeit deque((node[0] for node in pwnodes if node[1].get('pm')), maxlen=0)
    1000 loops, best of 3: 305 µs per loop
    
    In [25]: %timeit deque([node[0] for node in pwnodes if node[1].get('pm')], maxlen=0)
    1000 loops, best of 3: 246 µs per loop
    

    The deque is used to illustrate iteration speed; a deque with maxlen set to 0 discards all elements taken from the iterable so there are no memory allocation differences to skew the results.

    That's because in Python 2, list comprehensions don't use a separate namespace, while a generator expression does (it has to, by necessity). That extra namespace requires a new frame on the stack, and this is expensive. The major advantage of generator expressions is their low memory footprint, not their speed.

    In Python 3, list comprehensions have a separate namespace as well, and list comprehension and generator iteration speed is comparable. You also have set comprehensions, which are fastest still, even on Python 2.