Search code examples
rdategroup-bytidyverselubridate

Group by different date ranges in R?


If I have data set with daily data, how I can I group_by a custom set of date ranges? The below screen grab details the of the date ranges I'd like to group by. Any help would be greatly appreciated.

Thank you

enter image description here


Solution

  • You can achieve this by creating a data.frame where you can define the date ranges above, Here I have generated data to illustrate this:

    library(dplyr)
    library(lubridate)
    
    # random data 
    dates <- seq(as.Date("2020-01-01"), as.Date("2022-12-31"), by = "day")
    df <- data.frame(date = dates)
    
    # define date ranges
    ranges <- data.frame(
      start = ymd(c("2019-06-19", "2020-06-20", "2021-06-21", "2022-06-22")),
      end = ymd(c("2020-05-20", "2021-05-21", "2022-05-22", "2023-05-23")),
      group = c("Jun 19 - May 20", "Jun 20 - May 21", "Jun 21 - May 22", "Jun 22 - May 23")
    )
    
    # breaks for the ranges
    breaks <- c(ranges$start, ranges$end[nrow(ranges)])
    
    # label for range groups
    labels <- ranges$group
    
    # group data by date ranges
    df <- df %>%
      mutate(group = cut(date, breaks = breaks, labels = labels, include.lowest = TRUE)) %>% 
      group_by(group) %>% 
      summarise(mean_value = mean(value))