I have a table like below containing values for multiple IDs:
ID | value | date |
---|---|---|
1 | 20 | 2022-01-01 12:20 |
2 | 25 | 2022-01-04 18:20 |
1 | 10 | 2022-01-04 11:20 |
1 | 150 | 2022-01-06 16:20 |
2 | 200 | 2022-01-08 13:20 |
3 | 40 | 2022-01-04 21:20 |
1 | 75 | 2022-01-09 08:20 |
I would like to calculate week wise sum of values for all IDs:
The start date is given (for example, 01-01-2022).
Weeks are calculated based on range:
ID | Week 1 sum | Week 2 sum | Week 3 sum | ... |
---|---|---|---|---|
1 | 180 | 75 | -- | -- |
2 | 25 | 200 | -- | -- |
3 | 40 | -- | -- | -- |
There's a pandas function (pd.Grouper
) that allows you to specify a groupby instruction.1 In this case, that specification is to "resample" date by a weekly frequency that starts on Fridays.2 Since you also need to group by ID
as well, add it to the grouper.
# convert to datetime
df['date'] = pd.to_datetime(df['date'])
# pivot the dataframe
df1 = (
df.groupby(['ID', pd.Grouper(key='date', freq='W-FRI')])['value'].sum()
.unstack(fill_value=0)
)
# rename columns
df1.columns = [f"Week {c} sum" for c in range(1, df1.shape[1]+1)]
df1 = df1.reset_index()
1 What you actually need is a pivot_table
result but groupby
+ unstack
is equivalent to pivot_table
and groupby
+ unstack
is more convenient here.
2 Because Jan 1, 2022 is a Saturday, you need to specify the anchor on Friday.