Tags: python, python-3.x, python-itertools

Most efficient way to convert/unpack an itertools.chain object to an unordered and ordered list


Other than using the list and sorted built-ins to convert an itertools.chain object into an unordered and an ordered list, respectively, are there more efficient ways of doing the same in Python 3? I read in this answer that list is for debugging. Is this true?

Below is example code in which I time both approaches:

from itertools import chain
from time import time

def foo(n):
    # Yield n ranges of length n each, i.e. n * n items once flattened.
    for i in range(n):
        yield range(n)

def check(n):
    # check list method
    start = time()
    a = list(chain.from_iterable(foo(n)))
    end = time() - start
    print('Time for list = ', end)
    # check sorted method
    start = time()
    b = sorted(chain.from_iterable(foo(n)))
    end = time() - start
    print('Time for sorted = ', end)

Results:

>>> check(1000)
Time for list =  0.04650092124938965
Time for sorted =  0.08582258224487305
>>> check(10000)
Time for list =  1.615750789642334
Time for sorted =  8.84056806564331
>>>

Solution

  • Other than using the list and sorted methods to convert an itertools.chain object to get an unordered and ordered list, respectively, are there more efficient ways of doing the same in python3?

    Simple answer: no. When working with Python generators and iterators, the one pitfall to avoid is converting a generator into a list, then back into a generator, then into a list again, and so on.

    For example, a chain like this would be pointless:

    list(sorted(list(filter(list(map(…
    

    because you would then lose all the added value of the generators.
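
    As a minimal sketch of the alternative, the steps can stay lazy and be materialized once, at the end. The function name squares_of_evens and the sample data are made up for illustration; they are not from the question.

```python
from itertools import chain

def squares_of_evens(groups):
    """Lazily flatten, filter, and transform; nothing is stored yet."""
    flat = chain.from_iterable(groups)          # lazy flatten
    evens = filter(lambda x: x % 2 == 0, flat)  # lazy filter
    return map(lambda x: x * x, evens)          # lazy map

# Hypothetical sample data.
groups = [[3, 4], [1, 2], [6, 5]]

# The only point where a list is built.
result = sorted(squares_of_evens(groups))
print(result)  # [4, 16, 36]
```

    Each intermediate step passes items along one at a time, so only the final sorted() call holds the whole dataset in memory.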

  • I read in this answer that list is for debugging. Is this true?

    It depends on your context. Generally speaking, list() is not for debugging; it is a different way to represent an iterable.

    You might want to use list() if you need to access an element at a given index, or if you need to know the length of the dataset. You should avoid list() if you can consume the data as it arrives.
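
    A short sketch of that distinction, assuming a hypothetical helper make_chain that builds a fresh chain on each call:

```python
from itertools import chain

def make_chain():
    # Hypothetical helper: returns a new chain object every time.
    return chain.from_iterable([range(3), range(3)])

# Indexing and len() need a concrete sequence, so build a list.
items = list(make_chain())
print(len(items), items[4])  # 6 1

# A chain object itself has no length.
try:
    len(make_chain())
except TypeError as exc:
    print("len() failed:", exc)

# If every item is used exactly once, no list is needed at all.
total = sum(make_chain())
print(total)  # 6
```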

    Think of the whole generator/iterator scheme as a way to apply an algorithm to each item as it becomes available, whereas with lists you work on the data in bulk.
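
    One way to see the difference is to record when items are produced and when they are consumed. This is only an illustration; the produce generator and the log list are invented for the example.

```python
from itertools import chain

log = []

def produce(n):
    # Hypothetical producer that records when each item is generated.
    for i in range(n):
        log.append(f"produce {i}")
        yield i

# Streaming: each item is consumed right after it is produced.
for item in chain(produce(2)):
    log.append(f"consume {item}")
streamed = list(log)

# Bulk: list() produces everything before anything is consumed.
log.clear()
for item in list(chain(produce(2))):
    log.append(f"consume {item}")
bulk = list(log)

print(streamed)  # ['produce 0', 'consume 0', 'produce 1', 'consume 1']
print(bulk)      # ['produce 0', 'produce 1', 'consume 0', 'consume 1']
```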

    As for the question you cite, it is very specific: it asks how to introspect a generator from the REPL in order to see what is inside. The advice from the person who answered is to use list(chain) only for introspection, and otherwise to keep the chain object as it was originally.
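
    One likely reason for that advice is that listing a chain object exhausts it. The sketch below shows this, along with itertools.tee as one possible way to look inside while keeping a usable iterator; the use of tee is a suggestion added here, not something the cited answer proposes.

```python
from itertools import chain, tee

# Listing a chain consumes it: a second pass finds nothing left.
c = chain([1, 2], [3])
first_look = list(c)
second_look = list(c)
print(first_look, second_look)  # [1, 2, 3] []

# tee() splits one iterator into two independent ones, so one copy
# can be inspected while the other stays usable.
c = chain([1, 2], [3])
peek_copy, c = tee(c)
inspected = list(peek_copy)
remaining = list(c)
print(inspected, remaining)  # [1, 2, 3] [1, 2, 3]
```

    Note that tee() buffers the items that one copy has seen and the other has not, so this suits quick inspection of small data rather than large streams.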