Search code examples
rdate

Generate a list of dates with corresponding option based on overview table


The title for this post is terrible, apologies for that! I am having trouble summarising very briefly.

I have a dataframe with a list of menu options, information for which days of the week the menu is valid, and information on which weeks of the year the menu applies.

Here's a (very) MWE

df <- rbind(
  c("A",1,1,0,1,1,0,0, "20240101", "20240107"),
  c("B",1,1,0,1,1,0,0, "20240108", "20240114"),
  c("C",0,0,1,0,0,0,0, "20240101", "20240107"),
  c("D",0,0,1,0,0,0,0, "20240108", "20240114"),
  c("E",1,1,1,1,1,1,1, "20240101", "20240114")) |> as.data.frame()

names(df) <- c("menu","monday", "tuesday","wednesday", "thursday",
               "friday", "saturday","sunday", "start", "end")

df$start <- as.Date(df$start,
                    format = "%Y%m%d")
df$end <- as.Date(df$end,
                    format = "%Y%m%d")

From this DF, I'd like to generate a df that has two columns: date and menu that applies for those days. Here, this would look something like

DATE          MENU
2024-01-01    A
2024-01-01    E
2024-01-02    A
2024-01-02    E
2024-01-03    C
2024-01-03    E
2024-01-04    A
2024-01-04    E
etc

I thought a start would be to make a list of dates and the day of the week they fall on.

df <- rbind(
  c("A",1,1,0,1,1,0,0, "20240101", "20240107"),
  c("B",1,1,0,1,1,0,0, "20240108", "20240114"),
  c("C",0,0,1,0,0,0,0, "20240101", "20240108"),
  c("D",0,0,1,0,0,0,0, "20240108", "20240114"),
  c("E",1,1,1,1,1,1,1, "20240101", "20240114")) |> as.data.frame()

names(df) <- c("menu","monday", "tuesday","wednesday", "thursday",
               "friday", "saturday","sunday", "start", "end")

df$start <- as.Date(df$start,
                    format = "%Y%m%d")
df$end <- as.Date(df$end,
                    format = "%Y%m%d")

I read up on dates in R (https://stats.oarc.ucla.edu/r/faq/how-does-r-handle-date-values/) and then converted the dates in the first df to the same format as those from the second df (figuring they would be easier to work with after in that format)

df$start <- as.numeric(df$start)
df$end <- as.numeric(df$end)

Now I'm a bit stuck. It's clear what needs to happen theoretically, but really don't know how to implement this. I've Googled around, but since I'm having a hard time precisely describing my issue, while still being general, this hasn't been very successful.

If any existing posts address this, I'll happily be referred to those :)


Solution

  • We can do this by creating a sequence of dates, unnesting it, pivoting, and filtering. I left the extra columns, but you could easily delete them from the result, but they may be useful for auditing the answer. You can also, of course, arrange(date) at the end if you want it sorted by date rather than by menu.

    library(purrr)
    library(dplyr)
    library(tidyr)
    library(lubridate)
    df |>
      mutate(date = map2(start, end, seq.Date, by = "1 day")) |>
      pivot_longer(monday:sunday) |>
      filter(value == 1) |>
      unnest(date) |>
      filter(tolower(weekdays(date)) == name)
    # # A tibble: 24 × 6
    #    menu  start      end        date       name      value
    #    <chr> <date>     <date>     <date>     <chr>     <chr>
    #  1 A     2024-01-01 2024-01-07 2024-01-01 monday    1    
    #  2 A     2024-01-01 2024-01-07 2024-01-02 tuesday   1    
    #  3 A     2024-01-01 2024-01-07 2024-01-04 thursday  1    
    #  4 A     2024-01-01 2024-01-07 2024-01-05 friday    1    
    #  5 B     2024-01-08 2024-01-14 2024-01-08 monday    1    
    #  6 B     2024-01-08 2024-01-14 2024-01-09 tuesday   1    
    #  7 B     2024-01-08 2024-01-14 2024-01-11 thursday  1    
    #  8 B     2024-01-08 2024-01-14 2024-01-12 friday    1    
    #  9 C     2024-01-01 2024-01-07 2024-01-03 wednesday 1    
    # 10 D     2024-01-08 2024-01-14 2024-01-10 wednesday 1    
    # # ℹ 14 more rows
    # # ℹ Use `print(n = ...)` to see more rows