I have a dataframe that ranges from 2016 to 2019 and shows the daily production numbers of a commodity. I have set the time as the index and converted it to datetime format.
Time          #Produced products
2016-03-26    274
....
2017-03-26    123
2017-02-27    245
....
2018-03-26    282
....
2019-03-23    296
Now I would like to find the average daily production over these years, but in a way that is aware of the weekdays. Since there are 52 weeks in a year, the output index would be: 1st Monday, 1st Tuesday, ..., 52nd Saturday, 52nd Sunday. For each index entry, it should give the mean of the production numbers over the years 2016-2019.
I guess that I should use something like a groupby on the month and a groupby on dayofweek, but I do not know how to implement it. Could you please guide me?
What you can do is add another column to the dataframe using the apply function and a helper function.
Note: this helper function assumes that all entries in the Time column are in the Python datetime format.
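def helper(my_date_time):
    # Weekday names in ISO order (Monday first)
    week_list = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
    # isocalendar() returns (ISO year, ISO week number, ISO weekday)
    week_number = my_date_time.isocalendar()[1]
    # ISO weekday runs from 1 (Monday) to 7 (Sunday), hence the - 1
    week_day = week_list[my_date_time.isocalendar()[2] - 1]
    # Build a label such as 'Saturday12'
    return week_day + str(week_number)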
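Once you have the helper function, you can modify your dataframe to include the weekday and week number in another column. If Time is the index rather than a column, as in the question, use df.index.map(helper) on the right-hand side instead.
df['week_number_day'] = df['Time'].apply(helper)
Then you can use this column to find the mean values you need: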
grouped_df = df.groupby("week_number_day")
mean_df = grouped_df.mean()
mean_df = mean_df.reset_index()
print(mean_df)
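
The same grouping can also be done without a helper function. The following is only a minimal sketch under some assumptions: the dates are the dataframe's index (as described in the question), pandas is version 1.1 or newer so that isocalendar() is available on a DatetimeIndex, and the sample data and the column name produced are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one row per day, with the date as the index.
# The column name "produced" is an assumption, not taken from the question.
idx = pd.date_range("2016-01-04", "2019-12-29", freq="D")
df = pd.DataFrame({"produced": range(len(idx))}, index=idx)

# isocalendar() on a DatetimeIndex returns a frame with the columns
# year, week and day (day: 1 = Monday ... 7 = Sunday).
iso = df.index.isocalendar()

# Group by (ISO week number, ISO weekday) and average across the years.
mean_by_slot = df["produced"].groupby([iso["week"], iso["day"]]).mean()

print(mean_by_slot.head(7))
```

The result is indexed by week number and weekday number, so it sorts in calendar order, unlike string labels such as 'Saturday12'. Keep in mind that ISO week numbers can reach 53 in some years, and that the first days of January can belong to the last ISO week of the previous year.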