Since the range object produces values on demand, does it mean that anytime a range is indexed, the iteration protocol is invoked up to that index?
What I mean is when:
>>> R = range(1,11)
>>> print(R[5])
6
Since R[5]
isn't stored in memory, is it calculated every time by creating a new iterator? If not, how is it possible to index a range object?
No iterator is created here, and no iteration occurs. The range
object is implemented such that Python computes the value of R[5]
on-demand in constant time.1
If the index i
is not negative, the computation boils down to:
i * step + start
So in the case of your code R[5]
, this would be 5*1 + 1
which is 6.
If the index i
is negative, the length of R
is added to i
first, and then the value is computed as before:
(i + len(R)) * step + start
When you write R[5]
, this Python syntax is eventually translated into a call to PyObject_GetItem
, which inspects the object R
to see how it should proceed to find the item at index 5.
PyObject_GetItem
first checks the tp_as_mapping
slot of the range
type. This is not null; it holds a reference to a struct called range_as_mapping
. PyObject_GetItem
then checks to see what is in the mp_subscript
field of this struct:
static PyMappingMethods range_as_mapping = {
(lenfunc)range_length, /* mp_length */
(binaryfunc)range_subscript, /* mp_subscript */
(objobjargproc)0, /* mp_ass_subscript */
};
As you can see in the snippet above, it finds the range_subscript
function occupying the mp_subscript
field.2
Now range_subscript
inspects the arguments it was passed (R
and 5
) to decide if a single index or a slice was requested. The integer 5
means that just a single index is needed and so the function delegates the computation of the value to compute_range_item
. This function performs the computation to return the integer 6 as outlined in the first part of this answer.
1 I have assumed you are using CPython: other Python implementations may implement the range
object differently.
2 If you were to call len(R)
, you can see the internal function in mp_length
that is called to compute the length of R
(cf. Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3?).