Search code examples
rdatedummy-data

Split 52 weeks of data into 12 months


I have a data.frame of exercise sessions completed for multiple participants (G1, G2 etc.) There a different types of exercise session (Sup, Home, etc.) with 2 sessions of each type per week. I have a "week" variable, but I need to give summary data by month, i.e. take 52 weeks of data and split it into 12 months. This is example data:

qdf = data.frame(id = rep(c("G1", "G2", "G3"), 16),
                 type = c(rep("Sup",24), rep("Home", 24)),
                 week = rep(c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4),2),
                 session = c(rep("S1",3), rep("S2", 3), rep("S3",3), rep("S4",3),
                             rep("S5",3), rep("S6", 3), rep("S7",3), rep("S8",3),
                             rep("H1",3), rep("H2", 3), rep("H3",3), rep("H4",3),
                             rep("H5",3), rep("H6", 3), rep("H7",3), rep("H8",3)))

My current idea on how to do this would be, after grouping by id, to assign a dummy date to each session, starting with 01-01-2020 for sessions S1 and H1, and then splitting the year into months based on dates. Alternatively, to add extra dummy rows, so that each week has 7 days, then assign dates and split.

I'm really not sure how to begin with either of these possible solutions, or if there isn't a better way.


Solution

  • If you set a starting point, for instance 2020-01-01, you can do the following:

    qdf <- data.frame(id = rep(c("G1", "G2", "G3"), 16),
                     type = c(rep("Sup",24), rep("Home", 24)),
                     week = rep(c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,4,4),2),
                     session = c(rep("S1",3), rep("S2", 3), rep("S3",3), rep("S4",3),
                                 rep("S5",3), rep("S6", 3), rep("S7",3), rep("S8",3),
                                 rep("H1",3), rep("H2", 3), rep("H3",3), rep("H4",3),
                                 rep("H5",3), rep("H6", 3), rep("H7",3), rep("H8",3)))
    
    library(lubridate)
    qdf <- qdf %>% mutate(
      date = ymd("2020-01-01") + weeks(week),
      month = month(date)
    )
    > head(qdf)
      id type week session month       date
    1 G1  Sup    1      S1     1 2020-01-08
    2 G2  Sup    1      S1     1 2020-01-08
    3 G3  Sup    1      S1     1 2020-01-08
    4 G1  Sup    1      S2     1 2020-01-08
    5 G2  Sup    1      S2     1 2020-01-08
    6 G3  Sup    1      S2     1 2020-01-08
    

    I haven't removed the date column on purpose so you can check what's going on.