pandas, numpy, vectorization

Vectorized solution for multiplying each element of an array with every element


I would like to multiply every item in an array with every item in the same array and get a DataFrame whose first column is a1*a1, a1*a2, a1*a3, ..., whose second column is a2*a1, a2*a2, etc., and I'd like a vectorized solution. What I have so far is only half-vectorized: I don't know how to get rid of the loop here and would appreciate any tips!

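import numpy as np
import pandas as pd

data = {'values': [0.7, 0.1, 0.5, 0.7]}
df = pd.DataFrame.from_dict(data)

# Build one row of products per index label (this is the loop to eliminate)
products = []
for i in df.index:
    products.append([np.multiply(item, df.loc[i, 'values']) for item in df['values']])
products = pd.DataFrame(products)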

Solution

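  • Use broadcasting. Indexing with a[:, None] turns the 1-D array of shape (n,) into a column of shape (n, 1); multiplying that column by a then broadcasts to the full (n, n) table of pairwise products:

    a = df['values'].to_numpy()
    
    out = pd.DataFrame(a[:, None] * a)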
    

    Or with numpy.outer:

    out = pd.DataFrame(np.outer(df['values'], df['values']))
    

    Output:

          0     1     2     3
    0  0.49  0.07  0.35  0.49
    1  0.07  0.01  0.05  0.07
    2  0.35  0.05  0.25  0.35
    3  0.49  0.07  0.35  0.49
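
To make the shapes involved explicit, here is a minimal sketch that builds the table all three ways (broadcasting, numpy.outer, and the loop from the question) and checks that they agree. The variable names col, broadcast, outer and looped are only illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': [0.7, 0.1, 0.5, 0.7]})
a = df['values'].to_numpy()      # 1-D array, shape (4,)

# Broadcasting: a column of shape (4, 1) times a row of shape (4,)
# is stretched to a (4, 4) table of pairwise products.
col = a[:, None]
broadcast = col * a

# np.outer computes the same outer product directly.
outer = np.outer(a, a)

# Loop-based version from the question, kept only for comparison.
looped = pd.DataFrame(
    [[item * df.loc[i, 'values'] for item in df['values']] for i in df.index]
)

out = pd.DataFrame(broadcast)

print(col.shape, broadcast.shape)
print(np.allclose(broadcast, outer))
print(np.allclose(out.to_numpy(), looped.to_numpy()))
```

Because the table is an outer product of the array with itself, it is symmetric, so it does not matter whether a1*a1, a1*a2, ... is read down the first column or across the first row.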