python iterator grouping python-itertools

list around groupby results in empty groups


I was playing around to get a better feeling for itertools.groupby, so I grouped a list of tuples by the number and tried to get a list of the resulting groups. When I convert the result of groupby to a list, however, I get a strange result: all but the last group are empty. Why is that? I assumed that turning an iterator into a list would be less efficient but would never change behavior. I guess the lists are empty because the inner iterators have already been traversed, but when and where does that happen?

import itertools

l=list(zip([1,2,2,3,3,3],['a','b','c','d','e','f']))
#[(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd'), (3, 'e'), (3, 'f')]

grouped_l = list(itertools.groupby(l, key=lambda x:x[0]))
#[(1, <itertools._grouper at ...>), (2, <itertools._grouper at ...>), (3, <itertools._grouper at ...>)]

[list(x[1]) for x in grouped_l]
#[[], [], [(3, 'f')]]


grouped_i = itertools.groupby(l, key=lambda x:x[0])
#<itertools.groupby at ...>
[list(x[1]) for x in grouped_i]
#[[(1, 'a')], [(2, 'b'), (2, 'c')], [(3, 'd'), (3, 'e'), (3, 'f')]]

Solution

  • From the itertools.groupby() documentation:

    The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible.

    Turning the output from groupby() into a list advances the groupby() object.
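
To make the effect of advancing visible, here is a minimal sketch that steps through the groupby() object by hand with next(). The names pairs, groups, first_items and stale_items are made up for this illustration. A group that is read before the next call to next() still has its items; a group that is read only after groupby() has moved on comes back empty.

```python
import itertools

# Illustrative data, already ordered by the grouping key.
pairs = [(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd')]
groups = itertools.groupby(pairs, key=lambda x: x[0])

# Read the first group right away, before advancing groupby().
key1, group1 = next(groups)
first_items = list(group1)   # [(1, 'a')]

# Advance twice without reading the second group in between.
key2, group2 = next(groups)
key3, group3 = next(groups)

# group2 shares the underlying iterator, which has already moved on.
stale_items = list(group2)   # []
```

list(itertools.groupby(...)) does the same thing in one go: it calls next() until the groupby() object is exhausted, so every earlier group is already stale by the time you look at it. What the very last group still yields at that point may differ between Python versions, so it is safer not to rely on it.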


    Hence, you shouldn't pass the itertools.groupby object directly to list(). If you want to keep the groups, consume each group into its own list while you iterate, before groupby() advances to the next one, for example with a list comprehension:

    grouped_l = [(a, list(b)) for a, b in itertools.groupby(l, key=lambda x:x[0])]
    

    This gives you a plain list of (key, list) pairs that you can iterate over as many times as you like. If you only need to iterate over the groups once, the second approach from your question is sufficient.
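
As a quick check of the "multiple times" part, the sketch below builds the list with the same comprehension and then walks over it twice. The names keys and sizes are only there for the illustration; the input is the list l from the question.

```python
import itertools

l = list(zip([1, 2, 2, 3, 3, 3], ['a', 'b', 'c', 'd', 'e', 'f']))

# Each group is copied into a list while groupby() is still positioned on it.
grouped_l = [(key, list(group))
             for key, group in itertools.groupby(l, key=lambda x: x[0])]

# First pass over the stored groups.
keys = [key for key, _ in grouped_l]

# Second pass: the groups are ordinary lists, so nothing has been used up.
sizes = [len(items) for _, items in grouped_l]
```

Here keys is [1, 2, 3] and sizes is [1, 2, 3], and both passes see the full groups because the items were copied out of the shared iterator up front.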