Search code examples
pythonpandaslambdatransformsubtraction

Create a new column based on a conditional subtraction in python


I'm trying to create a new column based on a conditional subtraction in python. I want to first group the dataframe by column A, then take the row value of C where B equals 2, and subtract that value from all values in column C.

import pandas as pd

data = [
["R", 1, 2],
["R", 2, 4],
["R", 3, 6],
["R", 4, 8],
["S", 0, 5],
["S", 1, 4],
["S", 2, 1],
["S", 3, 3]]

df = pd.DataFrame(data=data, columns=["a", "b", "c"])
df

Out[1]:
    a   b   c   
0   R   1   2   
1   R   2   4   
2   R   3   6   
3   R   4   8   
4   S   0   5   
5   S   1   4   
6   S   2   1   
7   S   3   3   

Want it to yield the column 'd' of:

Out[2]:
    a   b   c   d   
0   R   1   2  -2
1   R   2   4   0 
2   R   3   6   2
3   R   4   8   4
4   S   0   5   4
5   S   1   4   3
6   S   2   1   0 
7   S   3   3   2

I would appreciate any tips of how to use transform or map functions to solve the problem.


Solution

  • It's easiest to temporarily set the index to a. Then you can do the subtraction as usual while pandas will automatically align the index. Finally reset the index.

    df1 = df.set_index('a')
    df1['d'] = df1.c - df1.loc[df1.b.eq(2), 'c']
    df1.reset_index()
    

    Result:

       a  b  c  d
    0  R  1  2 -2
    1  R  2  4  0
    2  R  3  6  2
    3  R  4  8  4
    4  S  0  5  4
    5  S  1  4  3
    6  S  2  1  0
    7  S  3  3  2
    

    (this of course only works if there is no more than one row for each value in a where b equals 2)