Search code examples
pythonpython-3.xindexingpython-internals

Is the iteration protocol used when indexing a range object?


Since the range object produces values on demand, does it mean that anytime a range is indexed, the iteration protocol is invoked up to that index?

What I mean is when:

>>> R = range(1,11)
>>> print(R[5])
6

Since R[5] isn't stored in memory, is it calculated every time by creating a new iterator? If not, how is it possible to index a range object?


Solution

  • No iterator is created here, and no iteration occurs. The range object is implemented such that Python computes the value of R[5] on-demand in constant time.1

    If the index i is not negative, the computation boils down to:

    i * step + start
    

    So in the case of your code R[5], this would be 5*1 + 1 which is 6.

    If the index i is negative, the length of R is added to i first, and then the value is computed as before:

    (i + len(R)) * step + start
    

    Python-internals

    When you write R[5], this Python syntax is eventually translated into a call to PyObject_GetItem, which inspects the object R to see how it should proceed to find the item at index 5.

    PyObject_GetItem first checks the tp_as_mapping slot of the range type. This is not null; it holds a reference to a struct called range_as_mapping. PyObject_GetItem then checks to see what is in the mp_subscript field of this struct:

    static PyMappingMethods range_as_mapping = {
            (lenfunc)range_length,       /* mp_length */
            (binaryfunc)range_subscript, /* mp_subscript */
            (objobjargproc)0,            /* mp_ass_subscript */
    };
    

    As you can see in the snippet above, it finds the range_subscript function occupying the mp_subscript field.2

    Now range_subscript inspects the arguments it was passed (R and 5) to decide if a single index or a slice was requested. The integer 5 means that just a single index is needed and so the function delegates the computation of the value to compute_range_item. This function performs the computation to return the integer 6 as outlined in the first part of this answer.


    1 I have assumed you are using CPython: other Python implementations may implement the range object differently.

    2 If you were to call len(R), you can see the internal function in mp_length that is called to compute the length of R (cf. Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3?).