
DataFrames as matrices in a matrix product


Suppose I have the following two DataFrames in pandas.

import pandas as pd

data1 = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
df1 = pd.DataFrame(data1)

data2 = {'col3': [2,3]}
df2 = pd.DataFrame(data2)

I would like to treat the two DataFrames above as a $3\times 2$ matrix and a $2\times 1$ matrix and compute the matrix product of df1 and df2. I tried using

df = df1.dot(df2)

But I got the following error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [244], in <cell line: 1>()
----> 1 vol_df.dot(weight_df['basket_weight'].T)

File /shared/mamba/micromamba/envs/xr-sw-2209/lib/python3.9/site-packages/pandas/core/frame.py:1507, in DataFrame.dot(self, other)
   1505 common = self.columns.union(other.index)
   1506 if len(common) > len(self.columns) or len(common) > len(other.index):
-> 1507     raise ValueError("matrices are not aligned")
   1509 left = self.reindex(columns=common, copy=False)
   1510 right = other.reindex(index=common, copy=False)

ValueError: matrices are not aligned

How do I fix the error?


Solution

  • Your indices are not aligned: DataFrame.dot matches the column labels of df1 against the index labels of df2, so df2 would need col1/col2 as its index.

    data1 = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
    df1 = pd.DataFrame(data1)
    
    data2 = {'col3': [2,3]}
    df2 = pd.DataFrame(data2, index=['col1', 'col2'])
    
    df1.dot(df2)
    
       col3
    0    14
    1    19
    2    24
    

    Or, manually align:

    df1.dot(df2.set_axis(df1.columns))
    

    Another option is to use the underlying numpy array to bypass alignment:

    df1.dot(df2.to_numpy())
    

    Output:

        0
    0  14
    1  19
    2  24
    

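    If you want element-wise multiplication instead (each column of df1 scaled by the matching entry of col3, with df2 indexed by col1/col2 as in the first example):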

    df1.mul(df2['col3'], axis=1)
    
       col1  col2
    0     2    12
    1     4    15
    2     6    18
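
The options above can be checked against each other in one script. This is only a minimal sketch built on the frames from the question; the variable names (labels_match, aligned, raw, expected, scaled, error_message) are made up for illustration and are not part of pandas.

```python
import pandas as pd

df1 = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
df2 = pd.DataFrame({'col3': [2, 3]})  # default index: 0, 1

# DataFrame.dot pairs df1's column labels with df2's index labels.
# Here they differ ('col1', 'col2' versus 0, 1), hence the error.
labels_match = df1.columns.equals(df2.index)

error_message = None
try:
    df1.dot(df2)
except ValueError as exc:
    error_message = str(exc)

# Fix 1: relabel df2's rows so they match df1's columns.
aligned = df1.dot(df2.set_axis(df1.columns, axis=0))

# Fix 2: pass a plain NumPy array, which skips label alignment.
# The result keeps df1's index but gets a default column label (0).
raw = df1.dot(df2.to_numpy())

# Reference result computed purely in NumPy.
expected = df1.to_numpy() @ df2.to_numpy()

# Element-wise variant: scale each column, using the relabeled col3.
scaled = df1.mul(df2.set_axis(df1.columns, axis=0)['col3'], axis=1)

print(labels_match, error_message)
print(aligned)
print(raw)
print(scaled)
```

Summing scaled across its columns (scaled.sum(axis=1)) gives the same values as the matrix product, which is one way to see how the two operations relate. Note that the NumPy route depends on row order alone, so it is only safe when df2's rows are already in the same order as df1's columns.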