How to create month-end date series using complete function?


Here is my toy dataset:

df <- tibble::tribble(
         ~date, ~value,
   "2007-01-31",     25,
   "2007-05-31",     31,
   "2007-12-31",     26
  )

I am trying to create a month-end date series with the following code:

library(dplyr)
library(tidyr)

df %>%
  mutate(date = as.Date(date)) %>%
  complete(date = seq(as.Date("2007-01-31"), as.Date("2019-12-31"), by = "month"))

However, I am not getting the correct month-end dates.

   date       value
   <date>     <dbl>
 1 2007-01-31    25
 2 2007-03-03    NA
 3 2007-03-31    NA
 4 2007-05-01    NA
 5 2007-05-31    31
 6 2007-07-01    NA
 7 2007-07-31    NA
 8 2007-08-31    NA
 9 2007-10-01    NA
10 2007-10-31    NA
11 2007-12-01    NA
12 2007-12-31    26

What am I missing here? I am open to using functions from other packages.


Solution

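  • You don't need complete to generate the dates themselves; base R is enough.

    The last day of the month differs from month to month, so stepping by "month" from the 31st overflows into the following month whenever a month is shorter (hence 2007-03-03 in your output). The first day of a month always exists, so we can create a sequence of month-start dates and subtract one day from each.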

    seq(as.Date("2007-02-01"), as.Date("2008-01-01"), by="month") - 1
    #[1] "2007-01-31" "2007-02-28" "2007-03-31" "2007-04-30" "2007-05-31" "2007-06-30" 
    #    "2007-07-31" "2007-08-31" "2007-09-30" "2007-10-31" "2007-11-30" "2007-12-31"
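
    If the endpoints are not guaranteed to be month-end dates, the same idea can be wrapped in a small helper. This is only a sketch: month_ends is a hypothetical name (not part of base R or any package), and it assumes from is not later than to.

    ```r
    # Hypothetical helper: last day of every month from the month containing
    # `from` through the month containing `to` (assumes from <= to).
    month_ends <- function(from, to) {
      # Snap both endpoints to the first day of their month
      first_from <- as.Date(format(as.Date(from), "%Y-%m-01"))
      first_to   <- as.Date(format(as.Date(to),   "%Y-%m-01"))

      # Number of months covered by the range
      n_months <- length(seq(first_from, first_to, by = "month"))

      # Start of the *following* month for each covered month, minus one day
      next_starts <- seq(first_from, by = "month", length.out = n_months + 1)[-1]
      next_starts - 1
    }

    month_ends("2007-01-15", "2007-03-10")
    ```

    Because every intermediate date is the first of a month, the sequence never overflows, and leap years are handled by the date arithmetic itself.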
    

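    Applying the same logic to the data frame:

    library(dplyr)
    df %>%
      mutate(date = as.Date(date)) %>%
      # min(date) + 1 and max(date) + 1 are month starts here because
      # the first and last dates in df are themselves month-ends
      tidyr::complete(date = seq(min(date) + 1, max(date) + 1, by = "month") - 1)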
    
    #    date       value
    #   <date>     <dbl>
    # 1 2007-01-31    25
    # 2 2007-02-28    NA
    # 3 2007-03-31    NA
    # 4 2007-04-30    NA
    # 5 2007-05-31    31
    # 6 2007-06-30    NA
    # 7 2007-07-31    NA
    # 8 2007-08-31    NA
    # 9 2007-09-30    NA
    #10 2007-10-31    NA
    #11 2007-11-30    NA
    #12 2007-12-31    26
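
The question's original code ran the series through 2019-12-31 rather than stopping at the last date in the data. A minimal sketch of that variant, assuming dplyr and tidyr are installed; all_month_ends and result are illustrative names only:

```r
library(dplyr)
library(tidyr)

df <- tibble::tribble(
  ~date,        ~value,
  "2007-01-31", 25,
  "2007-05-31", 31,
  "2007-12-31", 26
)

# Month starts from 2007-02-01 through 2020-01-01, shifted back one day,
# give every month-end from 2007-01-31 through 2019-12-31.
all_month_ends <- seq(as.Date("2007-02-01"), as.Date("2020-01-01"), by = "month") - 1

result <- df %>%
  mutate(date = as.Date(date)) %>%
  complete(date = all_month_ends)
```

Months with no observation get NA in value, just as in the shorter example above.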