I have a following data frame which I converted to pandas pivot table having two indexes "Date" and "Rating. The values are sorted in columns A, B and C.
I would like to find a solution which will subtract the values for each column and rating for consecutive days. Say, the change in A from 03/01/2007 to 02/01/2007 for rating M would be 0.4179 - 0.4256 = -0.0077
. The subtraction won't always be performed on a one day difference. But it will always be the (new date - the old date).
The results I'm looking for can be found in the table below:
If your dataframe is correctly sorted (or use df.sort_values('Date')
), you can use groupby_diff
:
# Replace ['A'] with ['A', 'B', 'C']
df['A_diff'] = df.groupby('Rating')['A'].diff().fillna(0)
Output:
>>> df
Date Rating A A_diff
0 02/01/2007 M 0.4256 0.0000
1 02/01/2007 MM 0.4358 0.0000
2 02/01/2007 MMM 0.4471 0.0000
3 03/01/2007 M 0.4179 -0.0077
4 03/01/2007 MM 0.4325 -0.0033
5 03/01/2007 MMM 0.4476 0.0005
6 04/01/2007 M 0.4173 -0.0006
7 04/01/2007 MM 0.4316 -0.0009
8 04/01/2007 MMM 0.4469 -0.0007
If you don't know how many columns you have, you can try:
cols = df.columns[2:]
df[cols] = df.groupby('Rating')[cols].diff().fillna(0)