python pandas dataframe datetime dayofweek

pandas mean of the data within the same day of week


I have a dataframe that covers 2016 to 2019 and shows the daily production numbers of a commodity. I have set the time as the index and converted it to datetime format.

      Time         #Produced products     
    2016-03-26    274
    ....
    2017-03-26    123
    2017-02-27    245
    ....
    2018-03-26    282
    ....
    2019-03-23    296

Now I would like to find the average daily production over these years, but in a way that is aware of the weekdays. Since a year has 52 weeks, the output index would be: 1st Monday, 1st Tuesday, ..., 52nd Saturday, 52nd Sunday. For each index entry, it should give the mean of the production numbers over the years 2016-2019.

I guess that I should use something like a groupby on month and a groupby on dayofweek, but I do not know how to implement it. Could you please guide me?


Solution

  • You can add another column to the dataframe using the apply function and a helper function.

    Note: this helper function assumes that all entries in the Time column are Python datetime objects (pandas Timestamp values also work).

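    def helper(my_date_time):
        week_list = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
        # isocalendar() returns (ISO year, ISO week number, ISO weekday), with Monday = 1
        iso_year, week_number, iso_weekday = my_date_time.isocalendar()
        week_day = week_list[iso_weekday - 1]
        # e.g. 'Monday12' for the Monday of ISO week 12
        return week_day + str(week_number)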
    

    Once you have the helper function, you can add a column to your dataframe that holds the weekday and week number:

    # If Time is the index rather than a column, use df.index.map(helper) instead
    df['week_number_day'] = df['Time'].apply(helper)
    

    Then you can group by this column to find the mean values you need:

    grouped_df = df.groupby("week_number_day")
    mean_df = grouped_df.mean()
    mean_df = mean_df.reset_index()
    print(mean_df)
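
As a variation on the steps above, here is a minimal sketch that skips the helper function and groups directly on the ISO week number and weekday of a DatetimeIndex, which matches the question's setup where Time is the index. It reuses the five sample rows from the question; the column name `produced` is a hypothetical stand-in for the real column. Note that ISO years sometimes have 53 weeks, and dates around New Year can belong to the first or last ISO week of the neighbouring year, so the result is not guaranteed to have exactly 52 weeks.

```python
import pandas as pd

# Sample rows taken from the question; "produced" is a hypothetical column name.
df = pd.DataFrame(
    {"produced": [274, 123, 245, 282, 296]},
    index=pd.DatetimeIndex(
        ["2016-03-26", "2017-03-26", "2017-02-27", "2018-03-26", "2019-03-23"],
        name="Time",
    ),
)

# DatetimeIndex.isocalendar() (pandas >= 1.1) returns a DataFrame with
# "year", "week" and "day" columns, where day 1 is Monday and day 7 is Sunday.
iso = df.index.isocalendar()

# Average the production over all years for each (ISO week, ISO weekday) pair.
mean_df = (
    df.groupby([iso["week"], iso["day"]])["produced"]
    .mean()
    .reset_index()
)
print(mean_df)
```

With this sample, 2016-03-26 and 2019-03-23 are both the Saturday of ISO week 12, so they land in the same group and average to 285.0; the other three dates each form their own group. Keeping week and day as separate numeric columns also keeps the result sorted in calendar order, which a string key such as 'Monday12' does not.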