Search code examples
pythonpython-internalsshort-circuiting

Python In Operator - Short Circuit


I was reading an interesting post on Short-Circuiting in Python and wondered if this was true for the in operator. My simple testing would conclude that it does not:

%%timeit -n 1000
0 in list(range(10))
1000 loops, best of 3: 639 ns per loop

%%timeit -n 1000
0 in list(range(1000))
1000 loops, best of 3: 23.7 µs per loop
# larger the list, the longer it takes. however, i do notice that a higher 
# value does take longer.

%%timeit -n 1000
999 in list(range(1000))
1000 loops, best of 3: 45.1 µs per loop

Is there a detailed explanation of why 999 takes longer than 0. Is the in operator like a loop?

Also, is there a way to tell the in operator to "stop the loop" once the value is found (or is this the already defaulted behavior that I'm not seeing)?

Lastly- Is there another operator/function that I am skipping over that does what I'm talking about in regards to "short-circuiting" in?


Solution

  • The implementation of in for list objects is found in list_contains. It performs a scan of the list and does exit early if the last comparison has found the element, there's no point in continuing there.

    The loop involved is:

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
                                           Py_EQ);
    

    If cmp is 1 (the value returned from PyObject_RichCompareBool for a match), the for loop condition (cmp == 0 && i < Py_SIZE(a)) becomes false and terminates.

    For list objects, which are built-in, what is called for in is a C function (for CPython). For other implementations of Python, this can be a different language using different language constructs.

    For user-defined classes in Python, what is called is defined in the Membership test operations of the Reference Manual, take a look there for a run-down of what gets called.


    You could also come to this conclusion by timing:

    l = [*range(1000)]    
    %timeit 1 in l
    85.8 ns ± 11.9 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
    %timeit 999 in l
    22 µs ± 221 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    

    The furthest the element the more you need to scan. If it didn't short-circuit, all in operations would result in similar timings.