Search code examples
pythonpandasdataframepivot-table

Pandas subtraction for multiindex pivot table


I have a following data frame which I converted to pandas pivot table having two indexes "Date" and "Rating. The values are sorted in columns A, B and C.

Pivot data

I would like to find a solution which will subtract the values for each column and rating for consecutive days. Say, the change in A from 03/01/2007 to 02/01/2007 for rating M would be 0.4179 - 0.4256 = -0.0077. The subtraction won't always be performed on a one day difference. But it will always be the (new date - the old date).

The results I'm looking for can be found in the table below:

enter image description here


Solution

  • If your dataframe is correctly sorted (or use df.sort_values('Date')), you can use groupby_diff:

    # Replace ['A'] with ['A', 'B', 'C']
    df['A_diff'] = df.groupby('Rating')['A'].diff().fillna(0)
    

    Output:

    >>> df
             Date Rating       A  A_diff
    0  02/01/2007      M  0.4256  0.0000
    1  02/01/2007     MM  0.4358  0.0000
    2  02/01/2007    MMM  0.4471  0.0000
    3  03/01/2007      M  0.4179 -0.0077
    4  03/01/2007     MM  0.4325 -0.0033
    5  03/01/2007    MMM  0.4476  0.0005
    6  04/01/2007      M  0.4173 -0.0006
    7  04/01/2007     MM  0.4316 -0.0009
    8  04/01/2007    MMM  0.4469 -0.0007
    

    If you don't know how many columns you have, you can try:

    cols = df.columns[2:]
    df[cols] = df.groupby('Rating')[cols].diff().fillna(0)