python pandas dataframe time-series cumsum

Pandas time series cumsum of previous records matching a column of current row

I'm trying to add a column that represents the cumulative sum of the previous records that match the current value of the row for a given column, let's take an example.

My data is :

|t         |label|qty|cumsum|
|2020-01-05|A    |10 |10    |
|2020-01-06|B    |5  |5     |
|2020-01-06|C    |2  |2     |
|2020-01-07|A    |-5 |5     |
|2020-01-07|A    |2  |7     |
|2020-01-08|B    |-30|-25   |
|2020-01-09|C    |5  |7     |

The cumsum column would be summarized as "do the cumulative sum of all past records where label is the same as the current row"

Ideally for code clarity, I'd like an expression such as df['cumsum']=df.loc[(df['label']==current label) & (df['t'] <= current index)].cumsum() that doesn't go explicitly through looping each previous columns in a for ix,row in df:

Another aspect is that the cumulative sum is done from the start, may the dataframe be 20 records long or 500k records long.

Also, for clarity I just put 3 labels in my example but I can have dozens of labels in my dataset

I found some other similar issues but they don't exactly suit my need :

Python Pandas: Cumulative Sum based on multiple conditions => It 'just' count the number of time where a condition is true in the past, not a cumulative sum of a specific column
Pandas temporal cumulative sum by group => Doesn't keep the record "live" for each line of the dataframe

Any kind of help is welcome ! Thanks !

Solution

df.groupby('label')['qty'].transform('cumsum')