How can I iterate over groupby results in pairs? What I tried isn't quite working:
from itertools import groupby,izip
groups = groupby([(1,2,3),(1,2),(1,2),(3,4,5),(3,4)],key=len)
def grouped(iterable, n):
    return izip(*[iterable]*n)
for g, gg in grouped(groups,2):
    print list(g[1]), list(gg[1])
Output I get:
[] [(1, 2), (1, 2)]
[] [(3, 4)]
Output I would like to have:
[(1, 2, 3)] [(1, 2), (1, 2)]
[(3, 4, 5)] [(3, 4)]
import itertools as IT
groups = IT.groupby([(1,2,3),(1,2),(1,2),(3,4,5),(3,4)], key=len)
groups = (list(group) for key, group in groups)
def grouped(iterable, n):
    return IT.izip(*[iterable]*n)
for p1, p2 in grouped(groups, 2):
    print p1, p2
yields
[(1, 2, 3)] [(1, 2), (1, 2)]
[(3, 4, 5)] [(3, 4)]
The code you posted is very interesting. It has a mundane problem and a subtle problem.
The mundane problem is that itertools.groupby returns an iterator that yields a (key, group) pair on each iteration. Since you are interested in only the groups, not the keys, you need something like
groups = (group for key, group in groups)
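To make the (key, group) structure concrete, here is a minimal sketch. It assumes Python 3 (the code above is Python 2), and the names data and keyed are mine, chosen only for illustration. Each group is turned into a list right away, while it is still the current group:

```python
from itertools import groupby

data = [(1, 2, 3), (1, 2), (1, 2), (3, 4, 5), (3, 4)]

# groupby yields (key, group) pairs; here the key is len(item).
# list(group) is called while that group is still the current one.
keyed = [(key, list(group)) for key, group in groupby(data, key=len)]

for key, members in keyed:
    print(key, members)
```

Note that the key 3 shows up twice: groupby only merges consecutive items with equal keys, which is what the pairing in the question relies on.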
The subtle problem is more difficult to explain, and I'm not really sure I understand it fully. Here is my guess: the iterator returned by groupby has turned its input, [(1,2,3),(1,2),(1,2),(3,4,5),(3,4)], into an iterator. The groupby iterator is wrapped around that underlying data iterator, much as a csv.reader is wrapped around an underlying file object iterator. You get one pass through this iterator and one pass only, and each group draws its items from that same shared iterator. The itertools.izip function, in the process of pairing items in groups, causes the groups iterator to advance from the first item to the second before you have read the first group. Since you only get one pass through the iterator, the first group's items have already been consumed, so when you call list(g[1]) it is empty.
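The effect can be reproduced without izip at all. The sketch below assumes Python 3, and the variable names are hypothetical; it simply pulls two groups out of groupby by hand before reading either one:

```python
from itertools import groupby

data = [(1, 2, 3), (1, 2), (1, 2), (3, 4, 5), (3, 4)]
groups = groupby(data, key=len)

key1, group1 = next(groups)
# Asking groupby for the next group moves the shared underlying
# iterator past everything that belonged to group1.
key2, group2 = next(groups)

first = list(group1)   # read too late: groupby has already moved on
second = list(group2)  # still the current group, so it has its items

print(first, second)
```

This mirrors the first line of output in the question: the first group comes back empty and the second one is intact.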
A not-so-satisfying fix to this problem is to convert the iterators in groups into lists:
groups = (list(group) for key, group in groups)
so itertools.izip will not prematurely consume them. Edit: On second thought, this fix is not so bad. groups remains an iterator, and only turns the group into a list as it is consumed.
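For anyone on Python 3, where izip no longer exists, the same fix can be sketched with the built-in zip. This is an adaptation under that assumption, not the code from the answer above; the explicit iter() call is my addition so that grouped also works when it is handed a plain list:

```python
from itertools import groupby

def grouped(iterable, n):
    """Yield consecutive, non-overlapping n-tuples from iterable."""
    iterator = iter(iterable)
    # The same iterator object is repeated n times, so zip pulls
    # n successive items from it to build each tuple.
    return zip(*[iterator] * n)

data = [(1, 2, 3), (1, 2), (1, 2), (3, 4, 5), (3, 4)]

# Each group becomes a list as soon as it is produced, before
# groupby is asked for the next one.
groups = (list(group) for key, group in groupby(data, key=len))

pairs = list(grouped(groups, 2))
for p1, p2 in pairs:
    print(p1, p2)
```

As with izip, a trailing group without a partner would be dropped silently, since zip stops as soon as it cannot fill a complete tuple.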