I am trying to subtract a value (50) from a subset of cells in a pandas DataFrame. I want to subtract the value from 'rt' where subj == 1 and cond == 'std'. I would like to perform this calculation in place so that the other values stay untouched.
Let's say I have the following DataFrame:
import pandas as pd

data = {'subj': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
        'rt': [100, 102, 101, 100, 101, 101, 105, 105, 106, 104, 104, 106],
        'cond': ['nov', 'std', 'std', 'emo', 'std', 'emo', 'nov', 'std', 'std',
                 'emo', 'std', 'emo']}
df = pd.DataFrame(data)
subj rt cond
0 1 100 nov
1 1 102 std
2 1 101 std
3 1 100 emo
4 1 101 std
5 1 101 emo
6 2 105 nov
7 2 105 std
8 2 106 std
9 2 104 emo
10 2 104 std
11 2 106 emo
Now I want to subtract 50 from the 'rt' values that meet the criteria subj == 1 and cond == 'std'. I am using the following code to subtract the value:
df['rt'] = df[(df['subj'] == 1) & (df['cond'] == 'std')]['rt'].subtract(50)
This is what I am expecting:
subj rt cond
0 1 100 nov
1 1 52 std
2 1 51 std
3 1 100 emo
4 1 51 std
5 1 101 emo
6 2 105 nov
7 2 105 std
8 2 106 std
9 2 104 emo
10 2 104 std
11 2 106 emo
Instead, this is what I get:
subj rt cond
0 1 NaN nov
1 1 52.0 std
2 1 51.0 std
3 1 NaN emo
4 1 51.0 std
5 1 NaN emo
6 2 NaN nov
7 2 NaN std
8 2 NaN std
9 2 NaN emo
10 2 NaN std
11 2 NaN emo
How can I keep the remaining values of the rt column instead of getting NaN? I also want to subtract different values from rt for other conditions in the same manner, without creating a separate DataFrame for each condition.
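The assignment to df['rt'] produces NaN because the filtered Series contains only the three matching rows. pandas aligns on the index when assigning, so every row missing from that Series becomes NaN (which also converts the column to floats). Use boolean indexing with loc to do it in place efficiently (only the 3 matching values will be computed):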
df.loc[df['subj'].eq(1) & df['cond'].eq('std'), 'rt'] -= 50
output:
subj rt cond
0 1 100 nov
1 1 52 std
2 1 51 std
3 1 100 emo
4 1 51 std
5 1 101 emo
6 2 105 nov
7 2 105 std
8 2 106 std
9 2 104 emo
10 2 104 std
11 2 106 emo
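Since the question also asks about subtracting different values for other conditions, the same loc pattern could be repeated per condition. The following is only a sketch: the `offsets` dictionary is a hypothetical name, and the (2, 'emo') entry is made up for illustration; only the (1, 'std') offset of 50 comes from the question.

```python
import pandas as pd

data = {
    'subj': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'rt': [100, 102, 101, 100, 101, 101, 105, 105, 106, 104, 104, 106],
    'cond': ['nov', 'std', 'std', 'emo', 'std', 'emo',
             'nov', 'std', 'std', 'emo', 'std', 'emo'],
}
df = pd.DataFrame(data)

# Hypothetical mapping: (subj, cond) -> value to subtract from 'rt'.
# Only the (1, 'std') -> 50 entry is taken from the question.
offsets = {
    (1, 'std'): 50,
    (2, 'emo'): 4,
}

for (subj, cond), value in offsets.items():
    # Boolean mask selecting the rows for this subject/condition pair.
    mask = df['subj'].eq(subj) & df['cond'].eq(cond)
    # Only the masked rows are modified; all other rows keep their value.
    df.loc[mask, 'rt'] -= value

print(df)
```

Because each step writes back through loc on the original DataFrame, no intermediate DataFrames are needed and the unselected rows never pass through an index-aligned assignment, so no NaN appears.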