python pandas dataframe rolling-computation

Pandas Rolling window with filtering condition to remove the some latest data

This is a follow-up question of this. I would like to perform a rolling window of the last n days but I want to filter out the latest x days from each window (x is smaller than n)

Here is an example:

d = {'Name': ['Jack', 'Jim', 'Jack', 'Jim', 'Jack', 'Jack', 'Jim', 'Jack', 'Jane', 'Jane'],
     'Date': ['08/01/2021',
              '27/01/2021',
              '05/02/2021',
              '10/02/2021',
              '17/02/2021',
              '18/02/2021',
              '20/02/2021',
              '21/02/2021',
              '22/02/2021',
              '29/03/2021'],
     'Earning': [40, 10, 20, 20, 40, 50, 100, 70, 80, 90]}

df = pd.DataFrame(data=d)
df['Date'] = pd.to_datetime(df.Date, format='%d/%m/%Y')
df = df.sort_values('Date')

   Name       Date  Earning
0  Jack 2021-01-08       40
1   Jim 2021-01-27       10
2  Jack 2021-02-05       20
3   Jim 2021-02-10       20
4  Jack 2021-02-17       40
5  Jack 2021-02-18       50
6   Jim 2021-02-20      100
7  Jack 2021-02-21       70
8  Jane 2021-02-22       80
9  Jane 2021-03-29       90

I would like to

For each row, take the last 30 days of the same Name - call it a window
Remove the latest 20 days of each window (i.e. only take the earliest 10 days)
Calculate the sum on the Earning column

Expected outcome: (The two columns Window_From and Window_To are not needed. I only use them to demonstrate the mock data)

   Name       Date  Earning Window_From  Window_To   Sum
0  Jack 2021-01-08       40  2020-12-09 2020-12-19   0.0
1   Jim 2021-01-27       10  2020-12-28 2021-01-07   0.0
2  Jack 2021-02-05       20  2021-01-06 2021-01-16  40.0
3   Jim 2021-02-10       20  2021-01-11 2021-01-21   0.0
4  Jack 2021-02-17       40  2021-01-18 2021-01-28   0.0
5  Jack 2021-02-18       50  2021-01-19 2021-01-29   0.0
6   Jim 2021-02-20      100  2021-01-21 2021-01-31  10.0
7  Jack 2021-02-21       70  2021-01-22 2021-02-01   0.0
8  Jane 2021-02-22       80  2021-01-23 2021-02-02   0.0
9  Jane 2021-03-29       90  2021-02-27 2021-03-09   0.0

Solution

Easy solution

Calculate 30 days and 20 days rolling sum then subtract 30 day sum from 20 day sum to get the effective rolling sum for first 10 days

s1 = df.groupby('Name').rolling('30d', on='Date')['Earning'].sum()
s2 = df.groupby('Name').rolling('20d', on='Date')['Earning'].sum()

df.merge(s1.sub(s2).reset_index(name='sum'), how='left')

   Name       Date  Earning   sum
0  Jack 2021-01-08       40   0.0
1   Jim 2021-01-27       10   0.0
2  Jack 2021-02-05       20  40.0
3   Jim 2021-02-10       20   0.0
4  Jack 2021-02-17       40   0.0
5  Jack 2021-02-18       50   0.0
6   Jim 2021-02-20      100  10.0
7  Jack 2021-02-21       70   0.0
8  Jane 2021-02-22       80   0.0
9  Jane 2021-03-29       90   0.0