Calculate the delta between entries in Pandas using partitions


I'm using a DataFrame in Pandas, and I would like to calculate the delta between adjacent rows within each partition.

For example, this is my initial set after sorting it by A and B:

    A   B    
1   12  40
2   12  50
3   12  65
4   23  30
5   23  45
6   23  60

I want to calculate the delta between adjacent B values, partitioned by A. If we call the result C, the final table should look like this:

    A   B   C   
1   12  40  NaN
2   12  50  10
3   12  65  15
4   23  30  NaN
5   23  45  15
6   23  60  15

The NaN appears because the first row in each partition (the smallest B, given the sort order) has no previous row to subtract from it.


Solution

  • You can group by column A and take the difference:

    df['C'] = df.groupby('A')['B'].diff()
    
    df
    Out: 
        A   B     C
    1  12  40   NaN
    2  12  50  10.0
    3  12  65  15.0
    4  23  30   NaN
    5  23  45  15.0
    6  23  60  15.0
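
For anyone who wants to reproduce this end to end, here is a minimal sketch. The DataFrame construction and the explicit `sort_values` call are my assumptions based on the sample data in the question; the `C_alt` column is a hypothetical name used only to show that subtracting a per-group `shift()` should give the same result as `diff()`.

```python
import pandas as pd

# Sample data from the question, with the same 1-based index.
df = pd.DataFrame(
    {"A": [12, 12, 12, 23, 23, 23], "B": [40, 50, 65, 30, 45, 60]},
    index=range(1, 7),
)

# diff() works on row order, so sort first if the data is not already sorted.
df = df.sort_values(["A", "B"])

# Difference between each B and the previous B within the same A group.
# The first row of each group has no predecessor, so it becomes NaN.
df["C"] = df.groupby("A")["B"].diff()

# Equivalent formulation: subtract the previous value within each group.
df["C_alt"] = df["B"] - df.groupby("A")["B"].shift()

print(df)
```

Note that C comes out as float rather than int, because NaN cannot be stored in a plain integer column.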