Tags: r, dataframe, time-series, lubridate, leap-year

How to repeat the value of the last day of February for a leap year in R?


I have a data.frame that doesn't account for leap years (i.e., every year has 365 days). For leap years, I would like to repeat the value of the last day of February. The DF in the code below holds fake data; I intentionally removed the leap-day row to create DF_NoLeapday. I would like to add a leap-day row to DF_NoLeapday by repeating the value of the last day of February in that leap year (in this example, the Feb 28, 2004 value). I would prefer a general solution that can be applied to many years of data.

set.seed(55)
DF <- data.frame(date = seq(as.Date("2003-01-01"), to= as.Date("2005-12-31"), by="day"),
                 A = runif(1096, 0,10),
                 Z = runif(1096,5,15))
DF_NoLeapday <-  DF[!(format(DF$date,"%m") == "02" & format(DF$date, "%d") == "29"),  ,drop = FALSE]

Solution

  • We can use complete() from tidyr on the 'date' column, which is already of class Date, to expand the rows and fill in the missing dates

    library(dplyr)
    library(tidyr)
    out <- DF_NoLeapday %>%
        complete(date = seq(min(date), max(date), by = '1 day'))
    dim(out)
    #[1] 1096    3
    
    out %>% 
        filter(date  >= '2004-02-28', date <= '2004-03-01')
    # A tibble: 3 x 3
    #  date           A     Z
    #  <date>     <dbl> <dbl>
    #1 2004-02-28  9.06  9.70
    #2 2004-02-29 NA    NA   
    #3 2004-03-01  5.30  7.35
    

    By default, the values of the other columns in the new rows are filled with NA. If we need a different constant value instead, it can be set within complete() through its fill argument.
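
    As a minimal sketch of that fill argument, the snippet below uses a small made-up data frame (toy, not the DF from the question) and an arbitrary replacement value of 0; both are assumptions for illustration only. Note that, by default, the fill argument also replaces NA values that were already present in the listed columns.

    ```r
    library(dplyr)
    library(tidyr)

    # Hypothetical stand-in for DF_NoLeapday: four days around the missing 2004-02-29
    toy <- data.frame(
      date = as.Date(c("2004-02-27", "2004-02-28", "2004-03-01", "2004-03-02")),
      A = c(1, 2, 3, 4),
      Z = c(10, 20, 30, 40)
    )

    # fill = list(...) names the value to use in each column for the rows
    # that complete() creates (here 0 instead of the default NA)
    filled_const <- toy %>%
      complete(date = seq(min(date), max(date), by = "1 day"),
               fill = list(A = 0, Z = 0))

    filled_const
    ```

    A constant is rarely what is wanted for a leap day, which is why carrying the previous value forward is usually the better fit here.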

    If we need the previous values instead, use tidyr's fill() function, which by default carries the last non-missing value downward

    out <- out %>%
        fill(A, Z)
    out %>%
        filter(date >= '2004-02-28', date <= '2004-03-01')
    # A tibble: 3 x 3
    #  date           A     Z
    #  <date>     <dbl> <dbl>
    #1 2004-02-28  9.06  9.70
    #2 2004-02-29  9.06  9.70
    #3 2004-03-01  5.30  7.35
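
One caveat for multi-year data: fill() carries values forward into every NA in the selected columns, not only the leap days, so any genuine gaps in the data get filled as well. If that matters, one possible alternative is to copy the previous row only into the Feb 29 rows. The base R sketch below does this on a small made-up data frame; toy and the other variable names are assumptions for illustration, and it assumes that Feb 29 is never the first row.

```r
# Hypothetical data: 2004-02-29 is missing, and A has a genuine NA on 2004-03-02
toy <- data.frame(
  date = as.Date(c("2004-02-27", "2004-02-28", "2004-03-01", "2004-03-02")),
  A = c(1, 2, 3, NA),
  Z = c(10, 20, 30, 40)
)

# Full daily sequence; the left join adds an all-NA row for each missing date
full <- data.frame(date = seq(min(toy$date), max(toy$date), by = "day"))
out <- merge(full, toy, by = "date", all.x = TRUE)  # merge() sorts by date

# Copy the preceding row (Feb 28) into each Feb 29 row, for all value columns
leap_rows <- which(format(out$date, "%m-%d") == "02-29")
value_cols <- setdiff(names(out), "date")
out[leap_rows, value_cols] <- out[leap_rows - 1, value_cols]

out
# The 2004-02-29 row now has A = 2 and Z = 20; the NA on 2004-03-02 is untouched
```

Because the Feb 29 rows are found by formatting the date, the same code applies unchanged to any number of years.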