Tags: python, python-itertools

Itertools saturation after 1 billion permutations


I wrote the following code to calculate the mean of the sums obtained over all the permutations of a list of lists:

import numpy as np
import itertools

r = 13
c = 5
a = np.arange(r*c).reshape(r, c)
a = list(itertools.product(*a))

res = sum([sum(e) for e in a])/len(a)

The code crashes, as r seems to be slightly too big and itertools/Python cannot handle it. Is there any other way to calculate res without the code crashing?


Solution

  • You are needlessly collecting first the products and then the individual sums into lists, when you can just iterate the iterators directly. You also don't need the len of the list, since you can compute the number of products directly as c**r.

    res = sum(sum(e) for e in itertools.product(*a)) / c**r   # a is the original numpy array, not the product list
    

    This will consume much less memory, which might keep your computer from freezing or crashing. However, for r=13 and c=5, this still means iterating over c**r = 1,220,703,125 combinations, which is probably too much for Python.
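
    To check that the generator version really computes the same value as the original list-based code, you can compare the two on a smaller problem where brute force is still feasible. This is just a sketch; r=6 is chosen arbitrarily so that c**r stays small:

    import itertools
    import numpy as np

    # Smaller problem: only c**r = 15,625 products, so brute force is fine.
    r, c = 6, 5
    a = np.arange(r * c).reshape(r, c)

    # Original approach: materialise all products and all sums in lists.
    prods = list(itertools.product(*a))
    res_list = sum([sum(e) for e in prods]) / len(prods)

    # Streaming approach: iterate the product iterator directly.
    res_iter = sum(sum(e) for e in itertools.product(*a)) / c**r

    assert res_list == res_iter
    print(res_iter)   # 87.0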

    However, since you are generating all products, each element appears in exactly c**(r-1) of the c**r products, so the total of all product sums is c**(r-1) times the sum of all elements, and the average sum is simply the sum of all elements divided by c. You therefore do not have to calculate or iterate the products at all; you can compute the average sum directly like this:

    res = sum(sum(a)) // c   # here, a is the numpy array, not the product iterator
    

    (This holds when all lists have the same number of elements; if the lists have different sizes, the formula is a bit more complicated, but it can still be evaluated directly without iterating the products; see the sketch below.)
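
    For the unequal-length case, one way to see the direct formula is that each product draws one element from each list independently and uniformly, so the average sum is just the sum of the per-list means. The sketch below uses made-up example lists, and the brute-force comparison is only there to verify the identity on a tiny input:

    import itertools

    # Hypothetical lists of different lengths, just for illustration.
    lists = [[1, 2, 3], [10, 20], [5, 6, 7, 8]]

    # Direct formula: sum of the per-list means, no products needed.
    res_direct = sum(sum(lst) / len(lst) for lst in lists)

    # Brute-force check (only 3 * 2 * 4 = 24 products here).
    prods = list(itertools.product(*lists))
    res_brute = sum(sum(p) for p in prods) / len(prods)

    assert abs(res_direct - res_brute) < 1e-9
    print(res_direct)   # 23.5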