Search code examples
raveragepanel

R: Compute monthly averages for daily values


I have the following data which is stored as a data.frame in R: Daily value of product A, B and C from 2018-08-01 until 2019-12-31

Now I would like to compute the monthly average of the value for each product. Additionally, only data for the weekdays but not the weekends should be used to calculate the monthly average for each product. What would be the approach in R to get to the required data?


Solution

  • Here is a solution, using dplyr and tidyr:

    df <- data.frame(Product = c("A", "B", "C"), "Value_2018-08-01" = c(120L, 100L, 90L), 
                         "Value_2018-08-02" = c(80L, 140L, 20L), "Value_2018-08-03" = c(50L, 70L, 200L), 
                         "Value_2018-12-31" = c(50L, 24L, 24L), "Value_2019-01-01" = c(44L, 60L, 29L), 
                         "Value_2019-12-31" = c(99L, 49L, 49L))
    
    
    
    df %>% 
      tidyr::pivot_longer(c(starts_with("Value"))) %>%
      mutate(Date = name,
             Date = sub(".*_", "", Date),
             Date = as.Date(Date, format="%Y.%m.%d"),
             weekday = weekdays(Date)) %>% 
      filter(!weekday %in% c("Samstag", "Sonntag")) %>% 
      group_by(Product, format(Date, "%m")) %>% 
      summarize(mean(value)) %>% 
      as.data.frame()
    
      Product format(Date, "%m") mean(value)
    1       A                 01    44.00000
    2       A                 08    83.33333
    3       A                 12    74.50000
    4       B                 01    60.00000
    5       B                 08   103.33333
    6       B                 12    36.50000
    7       C                 01    29.00000
    8       C                 08   103.33333
    9       C                 12    36.50000
    

    Note that Samstag and Sonntag should be changed to the names of the weekend days in the language of your working system.

    Also, I've calculated the monthly averages as you asked for it. However, if you want to have monthly averages per year, you should change group_by(Product, format(Date, "%m"))to group_by(Product, format(Date, "%m"),format(Date, "%Y")).