Here is my problem. I have a dataframe like this:
ID item amount level
1 1 10 5
1 1 10 10
2 4 15 5
2 9 30 8
2 4 10 10
2 4 10 20
3 4 10 4
3 4 10 6
and I need to know, for each ID, at what level the cumulative sum of each item's amount reaches a fixed threshold. For example, suppose I need the first level at which a given item reaches a cumulative amount of 20 or more for a user. I would like to get something like:
ID item amount level
1 1 10 5
1 1 20 10
2 4 15 5
2 9 30 8
2 4 25 10
2 4 35 20
3 4 10 4
3 4 20 6
and then something like a list or a dictionary in which I can store the results, for example:
d[item_number] = [list_of_levels_per_id_when_20_is_reached]
In this example:
{1: [10], 4: [10,6], 9: [8]}
You can compute the cumulative sum within each (ID, item) group with:
df['amount_cumsum'] = df.groupby(['ID', 'item'])['amount'].cumsum()
Output (as separate column for clarity):
ID item amount level amount_cumsum
0 1 1 10 5 10
1 1 1 10 10 20
2 2 4 15 5 15
3 2 9 30 8 30
4 2 4 10 10 25
5 2 4 10 20 35
6 3 4 10 4 10
7 3 4 10 6 20
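Then keep only the first row per (ID, item) where the cumulative sum reaches 20, and collect the levels per item. This assumes the rows are already ordered by level within each ID; without the drop_duplicates step, later rows of the same (ID, item) that are still above 20 would be included as well:
(df[df['amount_cumsum'].ge(20)]
 .drop_duplicates(['ID', 'item'])
 .groupby('item')['level'].agg(list)
 .to_dict()
)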
Output:
{1: [10], 4: [10, 6], 9: [8]}
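For reference, here is a minimal end-to-end sketch of the same idea, rebuilt from the sample data in the question. The helper name first_levels_reaching and its threshold parameter are hypothetical, introduced only for illustration. It also sorts by level within each ID, on the assumption that "first time" means the lowest level at which the threshold is reached.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":     [1, 1, 2, 2, 2, 2, 3, 3],
    "item":   [1, 1, 4, 9, 4, 4, 4, 4],
    "amount": [10, 10, 15, 30, 10, 10, 10, 10],
    "level":  [5, 10, 5, 8, 10, 20, 4, 6],
})


def first_levels_reaching(df, threshold):
    """Hypothetical helper: map each item to the levels at which
    each ID's running total for that item first reaches `threshold`."""
    # Assumption: the running total should follow increasing level per ID.
    ordered = df.sort_values(["ID", "level"], kind="stable")
    # Running total of amount within each (ID, item) pair.
    running = ordered.groupby(["ID", "item"])["amount"].cumsum()
    # Rows where the running total has reached the threshold.
    reached = ordered[running.ge(threshold)]
    # Keep only the first qualifying row per (ID, item) pair.
    first_hit = reached.drop_duplicates(["ID", "item"])
    # Collect the levels per item across IDs.
    return first_hit.groupby("item")["level"].agg(list).to_dict()


result = first_levels_reaching(df, 20)
print(result)
```

On the sample data this gives level 10 for item 1, levels 10 and 6 for item 4, and level 8 for item 9, which matches the dictionary requested in the question. An (ID, item) pair that never reaches the threshold simply does not contribute a level.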