Conditionals and itertools.groupby issue


I am using groupby to parse a list of words and organize them into lists by their length. For example:

from itertools import groupby

words = ['this', 'that', 'them', 'who', 'what', 'where', 'whyfore']

for key, group in groupby(sorted(words, key = len), len):
    print key, list(group)

3 ['who']
4 ['this', 'that', 'them', 'what']
5 ['where']
7 ['whyfore']

Getting the lengths of the lists works as well:

for key, group in groupby(sorted(words, key = len), len):
    print len(list(group))

1
4
1
1

The issue is that if I add a conditional first, like this, I get the following result:

for key, group in groupby(sorted(words, key = len), len):
    if len(list(group)) > 1:
        print list(group)

Output:

[]

Why is this?


Solution

  • Each group is an iterator, and turning it into a list exhausts it. The list(group) call in the if test consumes every item, so the second list(group) call has nothing left and returns an empty list.

    Store the list as a new variable:

    for key, group in groupby(sorted(words, key = len), len):
        grouplist = list(group)
        if len(grouplist) > 1:
            print grouplist
    

    Now you consume the iterator only once:

    >>> for key, group in groupby(sorted(words, key = len), len):
    ...     grouplist = list(group)
    ...     if len(grouplist) > 1:
    ...         print grouplist
    ... 
    ['this', 'that', 'them', 'what']
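
The exhaustion can also be seen in isolation. The following is a minimal sketch rather than part of the original answer: it uses Python 3 print() calls instead of the Python 2 print statement above, and the names groups_longer_than_one, first_pass and second_pass are made up for illustration.

```python
from itertools import groupby

words = ['this', 'that', 'them', 'who', 'what', 'where', 'whyfore']


def groups_longer_than_one(words):
    """Return every same-length group that contains more than one word."""
    result = []
    for key, group in groupby(sorted(words, key=len), len):
        members = list(group)  # consume the group iterator exactly once
        if len(members) > 1:
            result.append(members)
    return result


# Take only the first group (the words of length 3) and list it twice.
key, group = next(groupby(sorted(words, key=len), len))
first_pass = list(group)   # drains the group iterator
second_pass = list(group)  # nothing is left to yield

print(first_pass)
print(second_pass)
print(groups_longer_than_one(words))
```

The second list() call comes back empty for the same reason the conditional version printed []: the if test had already drained the group before it was printed.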