Subtracting pandas series from all elements of another pandas series with a common ID


I have a pandas SeriesGroupBy object; call it data. If I iterate over it and print each element, it looks like this:

<pandas.core.groupby.generic.SeriesGroupBy object at ***>
(1, 0     397.44
    1     12.72
    2     422.40
Name: value, dtype: float64)
(2, 3     398.88
    4     6.48
    5     413.52
Name: value, dtype: float64)
(3, 6     398.40
    7     68.40
    8     18.96
    9     56.64
    10    406.56
Name: value, dtype: float64)
(4, 11    398.64
    12    14.64
    13    413.76
Name: value, dtype: float64)
...

I want to make an equivalent object in which each entry is the cumulative sum within its group, minus the first entry of that group. So, for example, the first element would become:

(1, 0     0         #(= 397.44 - 397.44)
    1     12.72     #(= 397.44 + 12.72 - 397.44)
    2     435.12    #(= 397.44 + 12.72 + 422.40 - 397.44)

I can get the cumulative sum easily enough using apply:

cumulative_sums = data.apply(lambda x: x.cumsum())

but when I try to subtract the first element of each group in what seems to me the intuitive way (lambda x: x.cumsum() - x[0]), I get a KeyError. How can I achieve what I am trying to do?


Solution

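  • With an integer index, x[0] looks up the index label 0, not the first position. Only the first group contains a row labelled 0 (the others start at labels 3, 6, 11, ...), hence the KeyError. Use .iat[0] to take the first element by position instead: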

    cumulative_sums = data.apply(lambda x: x.cumsum() - x.iat[0])
    print(cumulative_sums)
    

    Prints:

    a  b 
    1  0       0.00
       1      12.72
       2     435.12
    2  3       0.00
       4       6.48
       5     420.00
    3  6       0.00
       7      68.40
       8      87.36
       9     144.00
       10    550.56
    Name: value, dtype: float64
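
For anyone who wants to reproduce this, here is a minimal sketch under a few assumptions: the frame is rebuilt from the values printed in the question, and the column names "id" and "value" are made up for illustration (the original data source was not shown). It first checks that a label lookup of 0 fails on the second group, then computes the same result without apply by subtracting each group's first value from the per-group cumulative sum.

```python
import pandas as pd

# Hypothetical frame rebuilt from the values shown in the question;
# the column names "id" and "value" are assumptions.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4],
    "value": [397.44, 12.72, 422.40, 398.88, 6.48, 413.52,
              398.40, 68.40, 18.96, 56.64, 406.56,
              398.64, 14.64, 413.76],
})
data = df.groupby("id")["value"]

# Label-based lookup: group 2 holds rows labelled 3, 4, 5, so label 0 is absent.
second_group = data.get_group(2)
try:
    second_group.loc[0]
except KeyError:
    label_lookup_failed = True
else:
    label_lookup_failed = False

# Vectorized alternative: per-group cumulative sum minus the per-group
# first value, broadcast back to every row with transform("first").
result = data.cumsum() - data.transform("first")
print(result)
```

Unlike the apply version, this result keeps the original flat row index rather than carrying the group key as an extra index level, so it can be assigned straight back as a new column, e.g. df["cum_minus_first"] = result (a hypothetical column name).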