I need to group the data by websites and get the average of views for the specific range of dates. My data looks like this:
date website amount_views
1/1/2021 a 23
1/2/2021 a 17
1/3/2021 a 10
1/4/2021 a 25
1/5/2021 a 2
1/1/2021 b 12
1/2/2021 b 7
1/3/2021 b 5
1/4/2021 b 17
1/5/2021 b 2
So I need to see what is the average for a and b websites for two ranges of date (1/1/2021 - 1/3/2021 (pre) AND 1/3/2021 - 1/5/2021 (post)) The desired output is:
date website avg_amount_views
pre a 31.5
post a 35.6
pre b 15.5
post b 22.6
You can use np.where and date.between to allocate the pre and post status and group by the same and websites and find mean.
In one line(though not so readable):
df['date']=pd.to_datetime(df['date'])
df.groupby([np.where(df['date'].between('1/1/2021','1/3/2021'),'pre'\
,'post'),'website'])['amount_views'].mean().to_frame('mean')
Step by step (more readable):
df['date']=pd.to_datetime(df['date'])
df['status']=np.where(df['date'].between('1/1/2021','1/3/2021'),'pre','post')
df.groupby(['status','website'])['amount_views'].mean().to_frame('mean')
mean
status website
post a 13.500000
b 9.500000
pre a 16.666667
b 8.000000