Search code examples
pythonpython-3.xlistlist-comprehensiongenerator-expression

Why is updating a list faster when using a list comprehension as opposed to a generator expression?


According to this answer lists perform better than generators in a number of cases, for example when used together with str.join (since the algorithm needs to pass over the data twice).

In the following example using a list comprehension seems to yield better performance than using a corresponding generator expression though intuitively the list comprehension comes with an overhead of allocating and copying to additional memory which the generator sidesteps.

In [1]: l = list(range(2_000_000))

In [2]: %timeit l[:] = [i*3 for i in range(len(l))]
190 ms ± 4.65 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit l[:] = (i*3 for i in range(len(l)))
261 ms ± 7.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit l[::2] = [i*3 for i in range(len(l)//2)]
97.1 ms ± 2.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [5]: %timeit l[::2] = (i*3 for i in range(len(l)//2))
129 ms ± 2.21 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [6]: %timeit l[:len(l)//2] = [i*3 for i in range(len(l)//2)]
92.6 ms ± 2.34 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [7]: %timeit l[:len(l)//2] = (i*3 for i in range(len(l)//2))
118 ms ± 2.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Why does a list comprehension yield better performance in these cases?


Solution

  • This answer concerns CPython implementation only. Using a list comprehension is faster, since the generator is first converted into a list anyway. This is done because the length of the sequence should be determined before proceeding to replace data, and a generator can't tell you its length.

    For list slice assignment, this operation is handled by the amusingly named list_ass_slice. There is a special-case handling for assigning a list or tuple, here - they can use PySequence_Fast ops.

    This is the v3.7.4 implementation of PySequence_Fast, where you can clearly see a type-check for list or tuples:

    PyObject *
    PySequence_Fast(PyObject *v, const char *m)
    {
        PyObject *it;
    
        if (v == NULL) {
            return null_error();
        }
    
        if (PyList_CheckExact(v) || PyTuple_CheckExact(v)) {
            Py_INCREF(v);
            return v;
        }
    
        it = PyObject_GetIter(v);
        if (it == NULL) {
            if (PyErr_ExceptionMatches(PyExc_TypeError))
                PyErr_SetString(PyExc_TypeError, m);
            return NULL;
        }
    
        v = PySequence_List(it);
        Py_DECREF(it);
    
        return v;
    }
    

    A generator expression will fail this type check and continue to the fallback code, where it is converted into a list object, so that the length can be predetermined.

    In the general case, a predetermined length is desirable in order to allow efficient allocation of list storage, and also to provide useful error messages with extended slice assignment:

    >>> vals = (x for x in 'abc')
    >>> L = [1,2,3]
    >>> L[::2] = vals  # attempt assigning 3 values into 2 positions
    ---------------------------------------------------------------------------
                                              Traceback (most recent call last)
    ...
    ValueError: attempt to assign sequence of size 3 to extended slice of size 2
    >>> L  # data unchanged
    [1, 2, 3]
    >>> list(vals)  # generator was fully consumed
    []