Search code examples
pythonpython-2.7performancerangeoverhead

`in range` construction in python 2 --- working too slow


I wanted to check if a given x lies in the interval [0,a-1]. As a lazy coder I wrote

x in range(a)

and (because that piece of code was in 4.5 nested loops) quickly run into performance issues. I tested it and indeed, it turns out runtime of n in range(n) lies in O(n), give or take. I actually thought my code would be optimized to x >= 0 and x < a but it seems this is not the case. Even if I fix the range(a) in advance the time doesn't become constant (though it improves a lot) - see side notes.

So, my question is:

Should I use x >= 0 and x < a and never write x in range(a) ever again? Is there an even better way of writing it?


Side notes:

  1. I tried searching SO for range, python-2.7, performance tags put together, found nothing (same with python-2.x).
  2. If I try following:

    i = range(a)
    ...
    x in i
    

    so that the range is fixed and I only measure runtime of x in i, I still get runtime in O(x) (assuming a is large enough).

  3. runtime of n in xrange(n) lies in O(n) as well.
  4. I found this post, which asks similar question for python 3. I decided to test same stuff on python 3 and it rolled through tests like it's nothing. I got sad for python 2.

Solution

  • The problem with range in Python 2 is that it creates a list of values, so x in range(a) will create a list and linearly scan that list. xrange should be a generator, but it is not much faster; probably still just linearly scans the values, just without creating the entire list first.

    In [2]: %timeit 5*10**5 in range(10**6 + 1)  # Python 2
    10 loops, best of 3: 18.1 ms per loop
    
    In [3]: %timeit 5*10**5 in xrange(10**6 + 1) # Python 2
    100 loops, best of 3: 6.21 ms per loop
    

    In Python 3, range is much smarter, not only not creating the entire list, but also providing a fast implementation of the contains check.

    In [1]: %timeit 5*10**5 in range(10**6 + 1)  # Python 3
    1000000 loops, best of 3: 324 ns per loop
    

    Even faster and IMHO more readable: Using comparison chaining:

    In [2]: %timeit 0 <= 5*10**5 < 10**6 + 1     # Python 2 or 3
    10000000 loops, best of 3: 46.6 ns per loop
    

    Should I use x >= 0 and x < a and never write x in range(a) ever again? Is there an even better way of writing it?

    "No", "it depends", and "yes". You should not use x >= 0 and x < a because 0 <= x < a is shorter and easier to parse (for puny humans), and is interpreted as (0 <= x) and (x < a). And you should not use in range in Python 2, but in Python 3, you can use it if you like.

    Still, I'd prefer comparison chaining, since a <= x < b is much more explicit about bounds than x in range(a, b) (what if x == b?), which could prevent many off-by-one errors or +1 padding the range.

    Also, note that 0 <= x < a is not strictly the same as x in range(0, a), as a range will only ever contain integer values, i.e. 1.5 in range(0, 5) is False, whereas 0 <= 1.5 < 5 is True, which may of may not what you want. Also, using range you can use steps other than 1, e.g. 5 in range(4, 10, 2) is False, but the same can also be implemented using pure math, e.g. as (4 <= x < 10) and (x - 4 % 2 == 0).