I would like to subtract consecutive values in rows in reference to their values in the other columns.
The data frame looks like this:
A B
1 "Sat" 7
2 "Sat" 9
3 "Sat" 5
4 "Sun" 8
5 "Sun" 4
6 "Sun" 6
I would like to subtract consecutive values of the B column in reference to the A column and the output should be like this:
A B
1 "Sat" 0
2 "Sat" +2
3 "Sat" -4
4 "Sun" 0
5 "Sun" -4
6 "Sun" 2
I have been using df.diff()
to get this done in the whole dataframe but here it is like (key, value) with key from A column and several values from the B column. I want to subtract values in B column as long as the key in A column is "Sat" and when it encounter the second key in A column which is "Sun" should restart subtraction.
Any help in solving this problem?
First if necessary sort_values
by column A
, then use groupby
with diff
, for replace NaN
fillna
:
df['B'] = df.sort_values('A').groupby('A')['B'].diff().fillna(0)
print (df)
A B
1 Sat 0.0
2 Sat 2.0
3 Sat -4.0
4 Sun 0.0
5 Sun -4.0
6 Sun 2.0
If need cast to int
:
df['B'] = df.sort_values('A').groupby('A')['B'].diff().fillna(0).astype(int)
print (df)
A B
1 Sat 0
2 Sat 2
3 Sat -4
4 Sun 0
5 Sun -4
6 Sun 2
Another solution with diff
, where add 0
if mask
is not equal with shift
ed column A
by ne
:
df['B'] = df.B.diff().mask(df.A.ne(df.A.shift()), 0)
print (df)
A B
1 Sat 0.0
2 Sat 2.0
3 Sat -4.0
4 Sun 0.0
5 Sun -4.0
6 Sun 2.0