Search code examples
pythonpandasselectpandas-groupbycalculated-columns

Pandas: calculate first purchase amount


I need to calculate the first purchase amount for every client. This is my code:

ticket.groupby(['user_reference_id','total_amount']).reference_date.min().reset_index()``

And i have this result:

user_reference_id total_amount reference_date*

    159                  12.77       2019-06-17

    159                  19.73       2019-06-17

    159                  21.55       2019-06-18

    159                  49.15       2019-06-16

    159                  49.66       2019-06-14

enter image description here

I need it grouped by user_reference_id with the minimum reference_date (first date when a customer made the purchase) and corresponding total_amount. In this case i need the next output: reference_date 2019-06-14, user_reference_id 159 and total_amount 49.66


Solution

  • one option would be:

    step 1: sort by id and date

    df=df.sort_values(['user_reference_id', 'reference_date'], ascending=True)
    

    Step 2: as df is sorted, get the first value using groupby.agg function.

    df_result=df.groupby('user_reference_id',as_index=False).agg({'reference_date':'first', 'total_amount':'first'})