pandas pandas-groupby nested-lists cumulative-sum

Pandas function to group by cumulative sum and return another column when a certain amount is reached

Here is my problem. I have a dataframe like this:

ID   item   amount   level
1     1      10       5
1     1      10       10
2     4      15       5
2     9      30       8
2     4      10       10
2     4      10       20
3     4      10       4
3     4      10       6

and I need to know, for each ID, at what level the cumulative sum of each item reaches a fixed amount. For example, say I need to know the first time a given item reaches an amount of 20 or more for a user. I would like to have something like this (with amount replaced by its running total):

ID   item   amount   level
1     1      10       5
1     1      20       10
2     4      15       5
2     9      30       8
2     4      25       10
2     4      40       20
3     4      10       4
3     4      20       6

and then something like a list or a dictionary in which I can store the results, for example:

d[item_number] = [list_of_levels_per_id_when_20_is_reached]

In this example:

{1: [10], 4: [10,6], 9: [8]}

Solution

cumsum

You can compute the cumulative sum within each (ID, item) group with:

df['amount_cumsum'] = df.groupby(['ID', 'item'])['amount'].cumsum()

Output (as a separate column for clarity):

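   ID  item  amount  level  amount_cumsum
0   1     1      10      5             10
1   1     1      10     10             20
2   2     4      15      5             15
3   2     9      30      8             30
4   2     4      10     10             25
5   2     4      10     20             40
6   3     4      10      4             10
7   3     4      10      6             20

dictionary

Keep the rows that have reached the threshold, then only the first such row per (ID, item), since later rows (such as index 5) also pass the filter:

(df[df['amount_cumsum'].ge(20)]
 .drop_duplicates(['ID', 'item'])   # first row per (ID, item) at or above 20
 .groupby('item')['level'].agg(list)
 .to_dict()
 )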

Output:

{1: [10], 4: [10, 6], 9: [8]}
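
If the threshold needs to vary, the two steps can be wrapped in a small helper. This is only a sketch: `levels_at_threshold` and its parameter names are made up for illustration, and it assumes the rows are already ordered so that the first qualifying row per (ID, item) is the one you want. It keeps only that first row, so later rows that are also at or above the threshold do not add extra levels.

```python
import pandas as pd


def levels_at_threshold(df, threshold):
    """Hypothetical helper: map item -> levels at which each ID first reaches `threshold`."""
    # Running total per (ID, item), kept in a local Series so df is not modified.
    running = df.groupby(['ID', 'item'])['amount'].cumsum()
    reached = df[running.ge(threshold)]
    # Keep only the first qualifying row for every (ID, item) pair.
    first_hit = reached.drop_duplicates(['ID', 'item'])
    return first_hit.groupby('item')['level'].agg(list).to_dict()


# Sample frame retyped from the question.
sample = pd.DataFrame({
    'ID':     [1, 1, 2, 2, 2, 2, 3, 3],
    'item':   [1, 1, 4, 9, 4, 4, 4, 4],
    'amount': [10, 10, 15, 30, 10, 10, 10, 10],
    'level':  [5, 10, 5, 8, 10, 20, 4, 6],
})

result = levels_at_threshold(sample, 20)
print(result)
```

With the question's data and a threshold of 20, this should give the same dictionary as the output above. Items that never reach the threshold for any ID simply do not appear as keys.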