python numpy python-itertools

itertools product speed up


I use itertools.product to generate all possible sequences of length 13 over 4 elements. Both the 4 and the 13 can be arbitrary, but as it is, I get 4^13 (about 67 million) results, which is a lot. I need the result as a NumPy array and currently do the following:

  import itertools as it
  import numpy as np
  c = it.product([1, -1, 1j, -1j], repeat=length)  # 1j replaces np.complex(0, 1), which newer NumPy releases removed
  sendbuf = np.array(list(c))

With some simple profiling code shoved in between, it looks like the first line is pretty much instantaneous, whereas the conversion to a list and then to a NumPy array takes about 3 hours. Is there a way to make this quicker? It's probably something really obvious that I am overlooking.

Thanks!


Solution

  • The NumPy equivalent of itertools.product() is numpy.indices(), but it will only get you the product of ranges of the form 0,...,k-1:

    >>> numpy.rollaxis(numpy.indices((2, 3, 3)), 0, 4)
    array([[[[0, 0, 0],
             [0, 0, 1],
             [0, 0, 2]],
    
            [[0, 1, 0],
             [0, 1, 1],
             [0, 1, 2]],
    
            [[0, 2, 0],
             [0, 2, 1],
             [0, 2, 2]]],
    
    
           [[[1, 0, 0],
             [1, 0, 1],
             [1, 0, 2]],
    
            [[1, 1, 0],
             [1, 1, 1],
             [1, 1, 2]],
    
            [[1, 2, 0],
             [1, 2, 1],
             [1, 2, 2]]]])
    

    For your special case, you can use

    a = numpy.indices((4,)*13)
    b = 1j ** numpy.rollaxis(a, 0, 14)
    

    (This won't run on a 32-bit system, because the array is too large. Extrapolating from the sizes I can test, it should run in less than a minute, though.)

    EDIT: Just to mention it: the call to numpy.rollaxis() is more or less cosmetic; it only serves to get the same output order as itertools.product(). If you don't care about the order of the indices, you can just omit it (but it is cheap anyway, as long as you don't have any follow-up operations that would transform your array into a contiguous array).

    EDIT2: To get the exact analogue of

    numpy.array(list(itertools.product(some_list, repeat=some_length)))
    

    you can use

    numpy.array(some_list)[numpy.rollaxis(
        numpy.indices((len(some_list),) * some_length), 0, some_length + 1)
        .reshape(-1, some_length)]
    

    This got completely unreadable -- just tell me whether I should explain it any further :)
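
To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps and checks it against itertools.product() on a small input. The helper name product_array is made up for this illustration, and np.moveaxis(idx, 0, -1) is used in place of numpy.rollaxis(..., 0, some_length + 1); for this case both should move the leading axis to the end.

```python
import itertools

import numpy as np


def product_array(values, repeat):
    """Return all length-`repeat` tuples over `values` as a 2-D array.

    Hypothetical helper: rows are meant to come out in the same order as
    itertools.product(values, repeat=repeat).
    """
    values = np.asarray(values)
    # indices() has shape (repeat, n, n, ..., n); axis 0 selects the position
    # within each tuple.
    idx = np.indices((len(values),) * repeat)
    # Move the position axis to the end, then flatten the grid into rows.
    idx = np.moveaxis(idx, 0, -1).reshape(-1, repeat)
    # Integer-array indexing maps every index to the corresponding value.
    return values[idx]


symbols = [1, -1, 1j, -1j]
fast = product_array(symbols, repeat=3)
slow = np.array(list(itertools.product(symbols, repeat=3)))

print(fast.shape)                  # (64, 3)
print(np.array_equal(fast, slow))  # True
```

For the sizes in the question, the result has 4^13 rows of 13 complex128 values at 16 bytes each, which works out to 13 GiB for the final array alone (the intermediate index array comes on top of that), so it is worth trying a small repeat first.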