Here is my toy dataset:
df <- tibble::tribble(
~date, ~value,
"2007-01-31", 25,
"2007-05-31", 31,
"2007-12-31", 26
)
I am trying to create a month-end date series using the following code:
library(dplyr)
library(tidyr)

df %>%
  mutate(date = as.Date(date)) %>%
  complete(date = seq(as.Date("2007-01-31"), as.Date("2019-12-31"), by = "month"))
However, I am not getting the correct month-end dates.
date value
<date> <dbl>
1 2007-01-31 25
2 2007-03-03 NA
3 2007-03-31 NA
4 2007-05-01 NA
5 2007-05-31 31
6 2007-07-01 NA
7 2007-07-31 NA
8 2007-08-31 NA
9 2007-10-01 NA
10 2007-10-31 NA
11 2007-12-01 NA
12 2007-12-31 26
What am I missing here? I am open to using functions from any other package.
You don't need the complete function to build the date sequence itself; base R can do that.
Since the last day of the month differs from month to month, we can create a sequence of month-start dates (the 1st of each following month) and subtract 1 day from each of them.
seq(as.Date("2007-02-01"), as.Date("2008-01-01"), by="month") - 1
#[1] "2007-01-31" "2007-02-28" "2007-03-31" "2007-04-30" "2007-05-31" "2007-06-30"
# "2007-07-31" "2007-08-31" "2007-09-30" "2007-10-31" "2007-11-30" "2007-12-31"
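As a side-by-side illustration of why the original attempt goes wrong, here is a minimal sketch in base R. Stepping by month from the 31st keeps the day of the month, so a nonexistent date such as February 31 spills over into March, which matches the 2007-03-03 row in the question's output. The variable names bad and good are only for illustration.

```r
# Stepping by month from the 31st: dates that do not exist
# (e.g. Feb 31) overflow into the following month.
bad <- seq(as.Date("2007-01-31"), by = "month", length.out = 4)
print(bad)
# [1] "2007-01-31" "2007-03-03" "2007-03-31" "2007-05-01"

# Stepping from the 1st never overflows, so subtracting one day
# from each month start gives the true month-end.
good <- seq(as.Date("2007-02-01"), by = "month", length.out = 4) - 1
print(good)
# [1] "2007-01-31" "2007-02-28" "2007-03-31" "2007-04-30"
```

Every month has a 1st, which is why the month-start sequence is the safe one to step through.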
Applying the same logic to the data frame, we can do:
library(dplyr)

df %>%
  mutate(date = as.Date(date)) %>%
  # min(date) and max(date) are month-ends here, so adding 1 day gives month starts
  tidyr::complete(date = seq(min(date) + 1, max(date) + 1, by = "month") - 1)
# date value
# <date> <dbl>
# 1 2007-01-31 25
# 2 2007-02-28 NA
# 3 2007-03-31 NA
# 4 2007-04-30 NA
# 5 2007-05-31 31
# 6 2007-06-30 NA
# 7 2007-07-31 NA
# 8 2007-08-31 NA
# 9 2007-09-30 NA
#10 2007-10-31 NA
#11 2007-11-30 NA
#12 2007-12-31 26
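The min(date) + 1 trick relies on the first and last dates already being month-ends. If that cannot be guaranteed, one possible generalization is sketched below in base R. The helper name month_ends is hypothetical (it is not part of any package); it snaps both endpoints to the 1st of their month before applying the same "next month start minus one day" idea.

```r
# Hypothetical helper: month-end dates for every month between
# `from` and `to` (inclusive), whatever day of the month they fall on.
month_ends <- function(from, to) {
  # Snap both endpoints to the first day of their month
  first <- as.Date(format(as.Date(from), "%Y-%m-01"))
  last  <- as.Date(format(as.Date(to), "%Y-%m-01"))

  # Number of months covered by the range
  n <- length(seq(first, last, by = "month"))

  # Month starts shifted one month forward, then back one day
  next_starts <- seq(first, by = "month", length.out = n + 1)[-1]
  next_starts - 1
}

ends <- month_ends("2007-01-15", "2007-04-02")
print(ends)
# [1] "2007-01-31" "2007-02-28" "2007-03-31" "2007-04-30"
```

In the pipeline above, this could stand in for the seq(...) - 1 expression, for example complete(date = month_ends(min(date), max(date))).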