Search code examples
pythonaveragedate-rangepandas

get the average for specific range of dates in pandas


I need to group the data by websites and get the average of views for the specific range of dates. My data looks like this:

date        website         amount_views
1/1/2021        a               23
1/2/2021        a               17
1/3/2021        a               10
1/4/2021        a               25
1/5/2021        a               2
1/1/2021        b               12
1/2/2021        b               7
1/3/2021        b               5
1/4/2021        b               17
1/5/2021        b               2

So I need to see what is the average for a and b websites for two ranges of date (1/1/2021 - 1/3/2021 (pre) AND 1/3/2021 - 1/5/2021 (post)) The desired output is:

date        website         avg_amount_views
pre            a                 31.5
post           a                 35.6
pre            b                 15.5
post           b                 22.6

Solution

  • You can use np.where and date.between to allocate the pre and post status and group by the same and websites and find mean.

    In one line(though not so readable):

      df['date']=pd.to_datetime(df['date'])
      df.groupby([np.where(df['date'].between('1/1/2021','1/3/2021'),'pre'\
      ,'post'),'website'])['amount_views'].mean().to_frame('mean')
    

    Step by step (more readable):

    df['date']=pd.to_datetime(df['date'])
    df['status']=np.where(df['date'].between('1/1/2021','1/3/2021'),'pre','post')
    df.groupby(['status','website'])['amount_views'].mean().to_frame('mean')
    
                         mean
    status website           
    post   a        13.500000
           b         9.500000
    pre    a        16.666667
           b         8.000000