Search code examples
pythonpandasmultiplication

How do I multiply values of a datraframe column by values from a column from another dataframe based on a common category?


I have two dataframes:

data1 = {'Item': ['A', 'B', 'C', 'N'], 'Price': [1, 2, 3, 10], 'Category': ['X', 'Y', 'X', 'Z'], 'County': ['K', 'L', 'L', 'K']}
df1 = pd.DataFrame(data1)
df1

data2 = {'Category': ['X', 'Y', 'Z'], 'Value retained': [0.1, 0.2, 0.8]}
df2 = pd.DataFrame(data2)
df2

How do I multiply 'Value retained' by 'Price' following their respective Category and add the result as a new column in df1?

I've searched a lot for a solution and tried several different things, among them:

df3 = df1
for cat, VR in df2['Category', 'Value retained']:
    if cat in df1.columns:
        df3[cat] = df1['Price'] * VR

and

df3 = df1['Price'] * df2.set_index('Category')['Value retained']
df3

In my real dataframe I have 250k+ items and 32 categories with different values of 'value retained'.

I really appreciate any help for a newbie in Python coding.


Solution

  • You're 2nd approach would work if both dataframes have Category as index, but since you can't set_index on Category in df1 (because you have duplicated entries) you need to do a left merge on the two df based on the column Category and then multiply.

    df3 = df1.merge(df2, on='Category', how='left')
    df3['result'] = df3['Price'] * df3['Value retained']
    print(df3)
    
      Item  Price Category County  Value retained  result
    0    A      1        X      K             0.1     0.1
    1    B      2        Y      L             0.2     0.4
    2    C      3        X      L             0.1     0.3
    3    N     10        Z      K             0.8     8.0