Tags: python, python-itertools, cartesian-product

How to get Cartesian product of two iterables when one of them is infinite


Let's say I have two iterables, one finite and one infinite:

import itertools

teams = ['A', 'B', 'C']
steps = itertools.count(0, 100)

I was wondering if I can avoid the nested for loop and use one of the infinite iterators from the itertools module like cycle or repeat to get the Cartesian product of these iterables.

The loop should be infinite because the stop value for steps is unknown upfront.

Expected output:

$ python3 test.py  
A 0
B 0
C 0
A 100
B 100
C 100
A 200
B 200
C 200
etc...

Working code with nested loops:

from itertools import count, cycle, repeat

STEP = 100 
LIMIT = 500
TEAMS = ['A', 'B', 'C']


def test01():
    for step in count(0, STEP):
        for team in TEAMS:
            print(team, step)
        if step >= LIMIT:  # Limit for testing
            break

test01()

Solution

  • Try itertools.product

    from itertools import product
    for i, j in product(range(0, 501, 100), 'ABC'):
        print(j, i)
    

    As the docs say, product(A, B) is equivalent to ((x,y) for x in A for y in B). product returns an iterator that yields one tuple at a time, so it does not build the full list of result tuples in memory.
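As a quick check of that equivalence on finite inputs, the sketch below compares product with the corresponding generator-style comprehension. The names steps and teams are only illustrative and reuse the question's values.

```python
from itertools import product

steps = range(0, 201, 100)   # finite stand-in for the question's step values
teams = 'ABC'

# product(A, B) varies the last argument fastest, like the inner for-loop.
from_product = list(product(steps, teams))
from_genexpr = [(x, y) for x in steps for y in teams]

print(from_product == from_genexpr)
```

Both lists hold the same pairs in the same order, with the step as the first element of each pair.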

    This function is roughly equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:

    def product(*args, **kwds):
        # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
        # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
        pools = [tuple(pool) for pool in args] * kwds.get('repeat', 1)
        result = [[]]
        for pool in pools:
            result = [x+[y] for x in result for y in pool]
        for prod in result:
            yield tuple(prod)
    

    However, you can't use itertools.product with an infinite iterable, because of a known issue:

    According to the documentation, itertools.product is equivalent to nested for-loops in a generator expression. But, itertools.product(itertools.count(2010)) is not.

    >>> import itertools
    >>> (year for year in itertools.count(2010))
    <generator object <genexpr> at 0x026367D8>
    >>> itertools.product(itertools.count(2010))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    MemoryError
    

    The input to itertools.product must be a finite sequence of finite iterables.
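A generator expression, by contrast, pulls values from its outer iterable one at a time, so it can cope with an infinite input. The following is a minimal sketch, assuming the finite iterable (teams here) is a list that can be iterated repeatedly. The helper name team_steps is hypothetical, and islice is used only to cut the demonstration short.

```python
from itertools import count, islice

teams = ['A', 'B', 'C']

def team_steps(teams, start=0, step=100):
    """Lazily yield (team, step) pairs; the step sequence never ends."""
    # The infinite iterable must be the outer loop, so that each step
    # is paired with every team before the next step is drawn.
    return ((team, s) for s in count(start, step) for team in teams)

# Take only the first six pairs so the demonstration terminates.
for team, s in islice(team_steps(teams), 6):
    print(team, s)
```

Dropping the islice call gives a loop that runs until it is interrupted or broken out of, which matches the behaviour the question asks for.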

    For an infinite loop, you can use this code.