Search code examples
pythonpandasdataframemultiplicationsumproduct

Sumproduct over multiple DF columns


ID    height_1    area_1   in_1    height_2    area_2   in_2    height_3    area_3   in_3
1     4           20       1       7           87       1       2          21       1
2     10          83       1       12          32       1       9          41       0
3     16          78       1       12          17       0       np.nan     np.nan   np.nan

How can I calculate the sumproduct in a generic approch for each row looking like this...?

sumproduct = height_1 * area_1 * in_1 + height_2 * area_2 * in_2 + height_3 * area_3 * in_3 + ...  ```

For e.g. row1 = 4 * 20 * 1 + 7 * 87 * 1 + 2 * 21 * 1

My Dataframe has around 20 times `height_, area_, in_


Solution

  • Use wide_to_long and product and sum

    s = (pd.wide_to_long(df, i='ID', j='val', 
                         stubnames=['height','area','in'] ,sep='_', suffix='\d+')
           .product(1).sum(level=0))
    
    Out[575]:
    ID
    1     731.0
    2    1214.0
    3    1248.0
    dtype: float64
    

    To assign it to df:

    df['sumproduct'] = df.ID.map(s)
    

    Or

    df['sumproduct'] = s.values
    

    Or

    df['sumproduct'] = s.to_numpy()