Search code examples
pythonpython-itertoolscross-product

What should itertools.product() yield when supplied an empty list?


I guess it's an academic question, but the second result does not make sense to me. Shouldn't it be as thoroughly empty as the first? What is the rationale for this behavior?

from itertools import product

one_empty = [ [1,2], [] ]
all_empty = []

print [ t for t in product(*one_empty) ]  # []
print [ t for t in product(*all_empty) ]  # [()]

Updates

Thanks for all of the answers -- very informative.

Wikipedia's discussion of the Nullary Cartesian Product provides a definitive statement:

The Cartesian product of no sets ... is the singleton set containing the empty tuple.

And here is some code you can use to work through the insightful answer from sth:

from itertools import product

def tproduct(*xss):
    return ( sum(rs, ()) for rs in product(*xss) )

def tup(x):
    return (x,)

xs = [ [1, 2],     [3, 4, 5]       ]
ys = [ ['a', 'b'], ['c', 'd', 'e'] ]

txs = [ map(tup, x) for x in xs ]  # [[(1,), (2,)], [(3,), (4,), (5,)]]
tys = [ map(tup, y) for y in ys ]  # [[('a',), ('b',)], [('c',), ('d',), ('e',)]]

a = [ p for p in tproduct( *(txs + tys) )                   ]
b = [ p for p in tproduct( tproduct(*txs), tproduct(*tys) ) ]

assert a == b

Solution

  • From a mathematical point of view the product over no elements should yield the neutral element of the operation product, whatever that is.

    For example on integers the neutral element of multiplication is 1, since 1 ⋅ a = a for all integers a. So an empty product of integers should be 1. When implementing a python function that returns the product of a list of numbers, this happens naturally:

    def iproduct(lst):
      result = 1
      for i in lst:
        result *= i
      return result
    

    For the correct result to be calculated with this algorithm, result needs to be initialized with 1. This leads to a return value of 1 when the function is called on an empty list.

    This return value is also very reasonable for the purpose of the function. With a good product function it shouldn't matter if you first concat two lists and then build the product of the elements, or if you first build the product of both individual lists and then multiply the results:

    iproduct(xs + ys) == iproduct(xs) * iproduct(ys)
    

    If xs or ys is empty that only works if iproduct([]) == 1.

    Now the more complicated product() on iterators. Here also, from a mathematical point of view, product([]) should return the neutral element of that operation, whatever that is. It is not [] since product([], xs) == [], while for the neutral elements product([], xs) == xs should hold. It turns out, though, that [()] also isn't a neutral element:

    >>> list(product([()], [1,2,3]))
    [((), 1), ((), 2), ((), 3)]
    

    In fact, product() is not really a very nice mathematical product at all, since this above equation doesn't hold:

    product(*(xs + ys)) != product(product(*xs), product(*ys))
    

    Each application of product generates an additional layer of tuples and there is no way around that, so there can't even be a real neutral element. [()] comes pretty close though, it doesn't add or remove any elements, it just adds an empty tuple to each.

    [()]would in fact be the neutral element of this slightly adapted product function that only operates on lists of tuples, but doesn't add additional tuple layers on each application:

    def tproduct(*xss):
      # the parameters have to be lists of tuples
      return (sum(rs, ()) for rs in product(*xss))
    

    For this function the above product equation holds:

    def tup(x): return (x,)
    txs = [map(tup, x) for x in xs]
    tys = [map(tup, y) for y in ys]
    tproduct(*(txs + tys)) == tproduct(tproduct(*txs), tproduct(*tys))
    

    With the additional preprocessing step of packing the input lists into tuples, tproduct() gives the same result as product(), but behaves nicer from a mathematical point of view. Also its neutral element is [()],

    So [()] makes some sense as the neutral element of this kind of list multiplication. Even if it doesn't exactly fit product() it is a good choice for this function since it for example allows to define tproduct() without the need to introduce a special case for empty input.