Search code examples
python-3.xoptimizationvectorization

Pairwise product on a Python List (or numpy array)


I have a list of numbers (floats) (with the list size typically being greater than 20k) and need to find the pairwise product of each of those. For example if I have 10 numbers, I need an output of 45 numbers in a list with the output being these 45 numbers. A nested for loop in Python is extremely slow. Is there any way for me parallelize these calculations, given that the operation is simple product.

a =np.random.random(100)
c = []
i = 0
for num in range(0,len(a)):
    for num2 in range(0,len(a)):      
        if num2 > num:
            c.append(a[num]*a[num2])

Solution

  • Accessing value of Numpy array is slow (because each access does a C call with many internal checks and internal function sub-calls).

    One way to speed this up is to convert the Numpy array to a list, but this is still far from being optimal since the code is generally interpreted using CPython.

    A faster approach is to use vectorized calls so to avoid any loops and just do few Numpy calls. Here is a fully vectorized approach:

    x, y = np.meshgrid(a, a)
    product = x * y
    mask = np.triu(np.full((a.size, a.size), True), 1)
    result = product[mask]
    

    This is 18 times faster on my machine (1.33 ms VS 73 µs). Note that the mask can be precomputed once in a loop (resulting in a nearly twice faster code).

    If this is still not enough, then please consider using Cython or Numba. Note they can also help to reduce the memory footprint since there is no need to create the temporary matrices.