python performance set processing-efficiency

key in s vs key not in s speed

I have this code:

s = set([5,6,7,8])

if key in s:
    return True

if key not in s:
    return False

It seems to me that it shouldn't, in theory, differ time wise, but I may be missing something under the hood.

Is there any reason to prefer one over the other in terms of processing time or readability?

Perhaps is this an example of:

"Premature optimization is the root of all evil"?

Short Answer: No, no difference. Yes, probably premature optimization.

OK, I ran this test:

import random
s = set([5,6,7,8])
for _ in range(5000000):
    s.add(random.randint(-100000,100000000))

def test_in():
    count = 0
    for _ in range(50000):
        if random.randint(-100000,100000000) in s:
            count += 1
    print(count)

def test_not_in():
    count = 0
    for _ in range(50000):
        if random.randint(-100000,100000000) not in s:
            count += 1
    print(count)

When I time the outputs:

%timeit test_in()
10 loops, best of 3: 83.4 ms per loop

%timeit test_not_in()
10 loops, best of 3: 78.7 ms per loop

BUT, that small difference seems to be a symptom of counting the components. There are an average of 47500 "not ins" but only 2500 "ins". If I change both tests to pass, e.g.:

def test_in():
    for _ in range(50000):
        if random.randint(-100000,100000000) in s:
            pass

The results are nearly identical

%timeit test_in()
10 loops, best of 3: 77.4 ms per loop

%timeit test_not_in()
10 loops, best of 3: 78.7 ms per loop

In this case, my intuition failed me. I had thought that saying it is not in the set could have had added some additional processing time. When I further consider what a hashmap does, it seems obvious that this can't be the case.

Solution

Running a simple performance test in an ipython session with timeit confirms g.d.d.c's statement.

def one(k, s):
    if k in s:
        return True     

def two(k, s):
    if k not in s:
        return False     

s = set(range(1, 100))

%timeit -r7 -n 10000000 one(50, s)
## 83.7 ns ± 0.874 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit -r7 -n 10000000 two(50, s)
## 86.1 ns ± 1.11 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Optimisations such as this aren't going to gain you a lot, and as has been pointed out in the comments will in fact reduce the speed at which you'll push out bugfixes/improvements/... due to bad readability. For this type of low-level performance gains, I'd suggest looking into Cython or Numba.