
Breadth-first version of itertools.chain()


In itertools there's chain, which combines multiple generators into a single one and in essence does a depth-first iteration over them, i.e., chain.from_iterable(['ABC', '123']) yields A, B, C, 1, 2, 3. But there's no breadth-first version, or am I missing something? There's of course izip_longest (zip_longest in Python 3), but for a large number of generators this feels awkward, as the tuples will be very long and possibly very sparse.

I came up with the following:

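def chain_bfs(*generators):
    generators = list(generators)
    while generators:
        # Take the generator at the front of the queue.
        g = generators.pop(0)
        try:
            yield next(g)  # next(g) replaces the Python 2-only g.next()
        except StopIteration:
            pass  # exhausted: drop it
        else:
            generators.append(g)  # still live: move it to the back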

It feels a bit verbose to me. Is there a more Pythonic approach I'm missing? And would this function be a good candidate for inclusion in itertools?


Solution

  • You could use collections.deque() to rotate through your iterators; rotating a deque is much more efficient than popping from the front of a list, which is O(n). I'd also call it a chained zip rather than a 'breadth-first chain':

    from collections import deque
    
    def chained_zip(*iterables):
        iterables = deque(map(iter, iterables))
        while iterables:
            try:
                yield next(iterables[0])
            except StopIteration:
                iterables.popleft()
            else:
                iterables.rotate(-1)
    

    Demo:

    >>> list(chained_zip('ABC', '123'))
    ['A', '1', 'B', '2', 'C', '3']
    >>> list(chained_zip('AB', '1234'))
    ['A', '1', 'B', '2', '3', '4']
    

    There is also a roundrobin() recipe in the itertools documentation that does the same, using the itertools.cycle() and itertools.islice() functions:

    from itertools import cycle, islice

    def roundrobin(*iterables):
        "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
        # Recipe credited to George Sakkis
        pending = len(iterables)
        nexts = cycle(iter(it).__next__ for it in iterables)
        while pending:
            try:
                for next in nexts:
                    yield next()
            except StopIteration:
                pending -= 1
                nexts = cycle(islice(nexts, pending))
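
For comparison, the izip_longest idea from the question can be sketched roughly as below, using zip_longest (its Python 3 name) with a private sentinel as the fill value and then filtering the sentinel back out. The names interleave_longest and _MISSING are made up for illustration and are not part of itertools. This is only a sketch: it builds one tuple per round, so it has exactly the sparse-tuple overhead the question worries about when there are many inputs of very different lengths.

```python
from itertools import chain, zip_longest

# Hypothetical sentinel: a unique object that cannot collide with real items.
_MISSING = object()

def interleave_longest(*iterables):
    """Yield items round-robin, skipping inputs that are already exhausted."""
    # Each row holds one item per input, padded with _MISSING where an
    # input has run out.
    rows = zip_longest(*iterables, fillvalue=_MISSING)
    # Flatten the rows and drop the padding.
    return (item for item in chain.from_iterable(rows) if item is not _MISSING)

print(list(interleave_longest('ABC', '123')))
print(list(interleave_longest('AB', '1234')))
```

On these inputs it should produce the same order as chained_zip and roundrobin; the deque version is likely the better fit when the number of inputs is large.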