What is the time complexity of iterating, or more precisely each iteration through a deque from the collections library in Python?
An example is this:
elements = deque([1,2,3,4])
for element in elements:
print(element)
Is each iteration a constant O(1) operation? or does it do a linear O(n) operation to get to the element in each iteration?
There are many resources online for time complexity with all of the other deque methods like appendleft
, append
, popleft
, pop
. There doesn't seem to be any time complexity information about the iteration of a deque.
Thanks!
If your construction is something like:
elements = deque([1,2,3,4])
for i in range(len(elements)):
print(elements[i])
You are not iterating over the deque
, you are iterating over the range
object, and then indexing into the deque
. This makes the iteration polynomial time, since each indexing operation, elements[i]
is O(n). However, actually iterating over the deque
is linear time.
for x in elements:
print(x)
Here's a quick, empirical test:
import timeit
import pandas as pd
from collections import deque
def build_deque(n):
return deque(range(n))
def iter_index(d):
for i in range(len(d)):
d[i]
def iter_it(d):
for x in d:
x
r = range(100, 10001, 100)
index_runs = [timeit.timeit('iter_index(d)', 'from __main__ import build_deque, iter_index, iter_it; d = build_deque({})'.format(n), number=1000) for n in r]
it_runs = [timeit.timeit('iter_it(d)', 'from __main__ import build_deque, iter_index, iter_it; d = build_deque({})'.format(n), number=1000) for n in r]
df = pd.DataFrame({'index':index_runs, 'iter':it_runs}, index=r)
df.plot()
Now, we can actually see how the iterator protocol is implemented for deque
objects in CPython source code:
First, the deque
object itself:
typedef struct BLOCK {
struct BLOCK *leftlink;
PyObject *data[BLOCKLEN];
struct BLOCK *rightlink;
} block;
typedef struct {
PyObject_VAR_HEAD
block *leftblock;
block *rightblock;
Py_ssize_t leftindex; /* 0 <= leftindex < BLOCKLEN */
Py_ssize_t rightindex; /* 0 <= rightindex < BLOCKLEN */
size_t state; /* incremented whenever the indices move */
Py_ssize_t maxlen;
PyObject *weakreflist;
} dequeobject;
So, as stated in the comments, a deque
is a doubly-linked list of "block" nodes, where a block is essentially an array of python object pointers. Now for the iterator protocol:
typedef struct {
PyObject_HEAD
block *b;
Py_ssize_t index;
dequeobject *deque;
size_t state; /* state when the iterator is created */
Py_ssize_t counter; /* number of items remaining for iteration */
} dequeiterobject;
static PyTypeObject dequeiter_type;
static PyObject *
deque_iter(dequeobject *deque)
{
dequeiterobject *it;
it = PyObject_GC_New(dequeiterobject, &dequeiter_type);
if (it == NULL)
return NULL;
it->b = deque->leftblock;
it->index = deque->leftindex;
Py_INCREF(deque);
it->deque = deque;
it->state = deque->state;
it->counter = Py_SIZE(deque);
PyObject_GC_Track(it);
return (PyObject *)it;
}
// ...
static PyObject *
dequeiter_next(dequeiterobject *it)
{
PyObject *item;
if (it->deque->state != it->state) {
it->counter = 0;
PyErr_SetString(PyExc_RuntimeError,
"deque mutated during iteration");
return NULL;
}
if (it->counter == 0)
return NULL;
assert (!(it->b == it->deque->rightblock &&
it->index > it->deque->rightindex));
item = it->b->data[it->index];
it->index++;
it->counter--;
if (it->index == BLOCKLEN && it->counter > 0) {
CHECK_NOT_END(it->b->rightlink);
it->b = it->b->rightlink;
it->index = 0;
}
Py_INCREF(item);
return item;
}
As you can see, the iterator essentially keeps track of a block index, a pointer to a block, and a counter of total items in the deque. It stops iterating if the counter reaches zero, if not, it grabs the element at the current index, increments the index, decrements the counter, and tales care of checking whether to move to the next block or not. In other words, A deque could be represented as a list-of-lists in Python, e.g. d = [[1,2,3],[4,5,6]]
, and it iterates
for block in d:
for x in block:
...