Tags: python, args, iterable-unpacking

Explanation on unpacking a list with itertools.product


I can't seem to wrap my head around how unpacking (*) works together with itertools.product() in the example below.

from itertools import product

for x in product(["ghi","abc"]):
    print(x)

Output:

('ghi',)
('abc',)

And using *:

for x in product(*["ghi","abc"]):
    print(x)

Output:

('g', 'a')
('g', 'b')
('g', 'c')
('h', 'a')
('h', 'b')
('h', 'c')
('i', 'a')
('i', 'b')
('i', 'c')

How was this output generated? I know product() generates combinations according to the value of 'repeat'. In the example above, repeat defaults to 1. How come ('a','g') wasn't generated?

I guess my question is: what did *["ghi","abc"] actually pass to the product() function? I can see what the outcome is, but I just can't work out how it got there.


Solution

  • Unpacking the arguments to product is the same as manually entering each element of the list as a separate argument. That is:

    product(*[a, b])  is equivalent to  product(a, b)

    So what you're really doing is passing each string in the list to product() as its own argument: product(*["ghi","abc"]) becomes product("ghi", "abc"). Python simply does this for you behind the scenes.

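    The reason ('a', 'g') didn't appear in the results is how product works: every output tuple takes its first element from the first iterable and its second element from the second, so 'a' (from "abc") can never come before 'g' (from "ghi"). The official documentation of product does a good job of explaining how exactly it works: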

    Cartesian product of input iterables.

    Roughly equivalent to nested for-loops in a generator expression. For example, product(A, B) returns the same as ((x,y) for x in A for y in B).

    The nested loops cycle like an odometer with the rightmost element advancing on every iteration. This pattern creates a lexicographic ordering so that if the input’s iterables are sorted, the product tuples are emitted in sorted order.

    To compute the product of an iterable with itself, specify the number of repetitions with the optional repeat keyword argument. For example, product(A, repeat=4) means the same as product(A, A, A, A).

    This function is roughly equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:

    def product(*args, repeat=1):
        # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
        # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
        pools = [tuple(pool) for pool in args] * repeat
        result = [[]]
        for pool in pools:
            result = [x+[y] for x in result for y in pool]
        for prod in result:
            yield tuple(prod)
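
To make the points above concrete, here is a minimal sketch that checks them against the strings from the question. The variable names (words, unpacked, explicit, nested, swapped) are only illustrative and do not come from the original post.

```python
from itertools import product

words = ["ghi", "abc"]

# Without *, product() receives ONE iterable (the list itself), so each
# result is a 1-tuple holding one whole string.
print(list(product(words)))  # [('ghi',), ('abc',)]

# With *, the list is spread into two positional arguments, so this is
# the same call as product("ghi", "abc").
unpacked = list(product(*words))
explicit = list(product("ghi", "abc"))
assert unpacked == explicit

# Same pairs as the nested loops quoted from the documentation: the first
# slot always comes from "ghi", the second from "abc".
nested = [(x, y) for x in "ghi" for y in "abc"]
assert unpacked == nested
assert ("a", "g") not in unpacked

# Reversing the argument order puts "abc" in the first slot instead.
swapped = list(product(*reversed(words)))
assert ("a", "g") in swapped

print(unpacked[:3])  # [('g', 'a'), ('g', 'b'), ('g', 'c')]
```

So if pairs such as ('a', 'g') are wanted, one option is to pass the iterables in the other order; product itself never swaps positions between its arguments.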