Search code examples
pythonslicepython-itertools

itertools.islice implementation -- efficiently slicing a list


Earlier, I was trying to answer a question where I wanted to iterate over a list slice as efficiently as possible.

for x in lst[idx1:]:

isn't ideal as it creates a copy (In general, this is O(n)). My next thought was to use itertools.islice. But if you look at the documentation, it appears that islice will call next until it finds the index it is looking for at which point it will start to yield values. This is also O(n). It seems that there is an optimization that is available here if the object passed to islice is a list or a tuple -- It seems that you could iterate over the "slice" directly (in C) without actually making a copy. I was curious if this optimization is in the source, But I didn't find anything. I'm not extremely familiar with C and the python source tree, so it's entirely possible that I missed it.

My question is this:

Is there a way to iterate over a list "slice" without making a copy of the list slice and without burning through a bunch of unwanted elements (in an optimized C implementation)?

I'm well aware that I could write my own generator for this (very naively, not accounting for the fact that many of the arguments should be optional, etc.):

def myslice(obj,start,stop,stride):
    for i in xrange(start,stop,stride):
        yield obj[i]

but that's definitely not going to beat an optimized C implementation.


If you're wondering why I would need this over just looping over a slice directly, consider the difference between:

takewhile(lambda x: x == 5, lst[idx:])  #copy's the tail of the list unnecessarily

and

takewhile(lambda x: x == 5, islice(lst,idx,None)) #inspects the head of the list unnecessarily 

and finally:

takewhile(lambda x: x == 5, magic_slice(lst,idx,None)) #How to create magic_slice???

Solution

  • Is there a way to iterate over a list "slice" without making a copy of the list slice and without burning through a bunch of unwanted elements (in an optimized C implementation)?

    Yes there is, if you write that C implementation. Cython makes this particularly easy.

    cdef class ListSlice(object):
        cdef object seq
        cdef Py_ssize_t start, end
    
        def __init__(self, seq, Py_ssize_t start, Py_ssize_t end):
            self.seq = seq
            self.start = start
            self.end = end
    
        def __iter__(self):
            return self
    
        def __next__(self):
            if self.start == self.end:
                raise StopIteration()
            r = self.seq[self.start]
            self.start += 1
            return r