Search code examples
rtidyverselubridate

Unstack lubridate's interval class


I am trying to transform a dataframe df consisting of a value-column, two date columns (start and end), and a interval-column (duration) into the long format by unnesting/unstacking the duration column.

library(dplyr)
library(lubridate)

df <- data.frame(value = letters[1:3], start = as_date(1:3), end = as_date(3:1)+3) %>% 
          mutate(duration = interval(start, end))

The expected outcome would a dataframe where value, start, and end are duplicated for each day as defined by duration. For instance, the value 'a' would appear 6 times each time on a different day (2nd, 3rd, 4th, 5th, 6th, 7th January 1970).

I tried to use the unnest function from the tidyr package but nothing happened.

tidyr::unnest(df, duration) 

Any help is greatly appreciated :)


Solution

  • You can't unstack a column of intervals and expect it to generate all dates in between, but by using seq you can generate them yourself. Try this:

    library(tidyverse)
    library(lubridate)
    
    df %>%
      rowwise() %>% 
      summarise(
        value, dates = seq(start, end, by = 1)
      )
    
    
    #> # A tibble: 12 x 2
    #>    value dates     
    #>    <chr> <date>    
    #>  1 a     1970-01-02
    #>  2 a     1970-01-03
    #>  3 a     1970-01-04
    #>  4 a     1970-01-05
    #>  5 a     1970-01-06
    #>  6 a     1970-01-07
    #>  7 b     1970-01-03
    #>  8 b     1970-01-04
    #>  9 b     1970-01-05
    #> 10 b     1970-01-06
    #> 11 c     1970-01-04
    #> 12 c     1970-01-05
    

    Created on 2021-05-18 by the reprex package (v1.0.0)