python, functional-programming, python-itertools

Generating iterables of iterables with Python itertools (using the repeat function)


While experimenting with functional programming in Python, I noticed a difference between two expressions that I believed should give the same result.

In particular, what I want to do is build an iterable that consists of (or should I say yields?) other iterables. A simple example of what I want to do could be:

import itertools as itr
itr.repeat(itr.repeat(1,5),3)

That is, an iterable consisting of 3 iterables, which themselves consist of 5 occurrences of the number 1. This is, however, not what happens. What I get instead (translated to lists) is:

[[1,1,1,1,1],[],[]]

That is, the innermost iterable is not copied (it seems); instead, the same iterable is used again and again, so it runs out of elements after the first pass.
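
For reference, a minimal sketch that reproduces this result, assuming Python 3 and assuming "translated to lists" means calling list on each inner iterable in turn (the names outer and as_lists are only illustrative):

```python
import itertools as itr

# One inner repeat object, handed out three times by the outer repeat.
outer = itr.repeat(itr.repeat(1, 5), 3)

# Converting an inner iterable to a list consumes it, so the second and
# third conversions find the same, already exhausted, object.
as_lists = [list(inner) for inner in outer]
print(as_lists)  # [[1, 1, 1, 1, 1], [], []]
```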

A version of this that does work, using map, is:

import itertools as itr
map(lambda x: itr.repeat(1,5), range(3))

This produces the result I expect:

[[1,1,1,1,1],[1,1,1,1,1],[1,1,1,1,1]]

I don't understand why this works, while the method using only repeat does not. Maybe it has something to do with the fact that in the map version, the iterable coming from repeat is wrapped in a lambda, but should that make a difference? As far as I see it, the only difference between lambda x: itr.repeat(1,5) and itr.repeat(1,5) is that the first one takes an argument (which it then throws away) while the other one does not.


Solution

  • The difference is that itertools.repeat takes an object as its first argument, and when iterated it yields that same object multiple times. In this case, that object can only be iterated once before it is exhausted, hence the result you see.

    map takes a callable object as its first argument, and it calls that object multiple times, each time yielding the result.

    So, in your first code snippet there is only ever one repeat object, which yields 1 five times and is then exhausted. In your second snippet there is one lambda object, but each time it is called it creates a new repeat object that yields 1 five times.

    To get what you want you would normally write either:

    (itr.repeat(1,5) for _ in range(3))
    

    to get multiple independent iterators that each yield 1 five times, or:

    itr.repeat(tuple(itr.repeat(1,5)),3)
    

    since a tuple, unlike the return from itr.repeat, can be iterated repeatedly.

    Or of course, since this example is small you could forget about generators and just write:

    ((1,)*5,)*3
    

    which is concise but a bit obscure.

    Your problem is similar to the difference between the following:

    # there is only one inner list
    foo = [[]] * 3
    foo[0].append(0)
    foo
    # [[0], [0], [0]]
    
    # there are three separate inner lists
    bar = [[] for _ in range(3)]
    bar[0].append(0)
    bar
    # [[0], [], []]
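
The same sharing can be checked directly with identity tests. The following is a small sketch, assuming Python 3; the names shared, fresh and result are only illustrative and do not come from the question:

```python
import itertools as itr

# repeat hands back the very same inner object each time.
shared = list(itr.repeat(itr.repeat(1, 5), 3))
print(shared[0] is shared[1] is shared[2])  # True

# A generator expression evaluates itr.repeat(1, 5) anew for each element,
# so every element is a distinct object.
fresh = list(itr.repeat(1, 5) for _ in range(3))
print(fresh[0] is fresh[1])  # False

# A tuple can be iterated repeatedly, so sharing a single tuple is harmless.
result = [list(t) for t in itr.repeat(tuple(itr.repeat(1, 5)), 3)]
print(result)  # [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]
```

Sharing one object is only a problem when that object carries state that gets used up (an iterator) or mutated (a list); an immutable, re-iterable object such as a tuple can safely be repeated.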