I'm working with a data frame with dates and times. I need to set certain date columns to NA, but when I use the mutate function, the date columns turn into double columns!
Example:
df
## my_date_01 my_date_02
## <date> <date>
## 1 2018-09-13 NA
## 2 NA 2018-12-09
## 3 NA 2018-12-16
## 4 NA 2019-01-19
## 5 NA 2018-11-19
## 6 NA 2018-12-16
## 7 NA 2018-08-08
## 8 NA 2018-10-24
## 9 2019-03-26 NA
##10 NA 2018-12-11
df %>% mutate(my_date_01 = ifelse(my_date_01 > lubridate::date("2019-01-01"),
                                  NA,
                                  my_date_01))
## my_date_01 my_date_02
## <dbl> <date>
## 1 17787 NA
## 2 NA 2018-12-09
## 3 NA 2018-12-16
## 4 NA 2019-01-19
## 5 NA 2018-11-19
## 6 NA 2018-12-16
## 7 NA 2018-08-08
## 8 NA 2018-10-24
## 9 NA NA
##10 NA 2018-12-11
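Two things are happening here: the date later than 2019-01-01 is replaced with NA (this is OK), but the column my_date_01 is also converted from date to double. What am I missing?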
We can wrap NA in as.Date() so that both outcomes of the comparison have the same type (Date), and use case_when instead of ifelse:
library(dplyr)
df %>%
  mutate(my_date_01 = case_when(
    my_date_01 > lubridate::date("2019-01-01") ~ as.Date(NA),
    TRUE ~ my_date_01))
# my_date_01 my_date_02
#1 2018-09-13 <NA>
#2 <NA> 2018-12-09
#3 <NA> 2018-12-16
#4 <NA> 2019-01-19
#5 <NA> 2018-11-19
#6 <NA> 2018-12-16
#7 <NA> 2018-08-08
#8 <NA> 2018-10-24
#9 <NA> <NA>
#10 <NA> 2018-12-11
The fix above (passing as.Date(NA)) still does not work with ifelse, because in the source code of ifelse the result starts out as the logical test vector:
...
ans <- test # assigned to logical vector test
len <- length(ans)
ypos <- which(test)
npos <- which(!test)
if (length(ypos) > 0L)
ans[ypos] <- rep(yes, length.out = len)[ypos]
if (length(npos) > 0L)
ans[npos] <- rep(no, length.out = len)[npos]
ans
}
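The effect of that first assignment can be reproduced in isolation. This is a minimal base R sketch, not part of the ifelse source; the vectors test and dates are made up for illustration:

```r
# Mimic the first steps of ifelse(): the result starts as the logical test
test  <- c(FALSE, TRUE, NA)
dates <- as.Date(c("2018-09-13", "2019-03-26", NA))

ans  <- test
npos <- which(!test)
ans[npos] <- dates[npos]  # a Date value assigned into a logical vector

class(ans)  # "numeric": the Date class has been dropped
ans[1]      # 17787, the number of days since 1970-01-01
```

The assignment dispatches on the class of ans (logical), so only the underlying numbers of the Date are kept.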
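If that first line were instead
ans <- as.Date(rep(NA, length(test)))
it would work with the code below (where ifelsenew is ifelse with that one line changed). In the original, assigning Date values into the logical vector coerces the result to plain numeric, so the Date class is lost; that does not happen when ans is already a Date vector, as with the as.Date step above.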
df %>%
  mutate(my_date_01 = ifelsenew(my_date_01 > lubridate::date("2019-01-01"),
                                as.Date(NA),
                                my_date_01))
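ifelsenew is not defined in the answer. A minimal sketch of what it could look like, assuming it is just the ifelse body shown earlier with the first line swapped for the as.Date initialisation (the function body and the small test vector d are assumptions, and this version only makes sense for Date results):

```r
# Hypothetical helper: ifelse() body with a Date-typed result vector
ifelsenew <- function(test, yes, no) {
  ans  <- as.Date(rep(NA, length(test)))  # start from Date, not from `test`
  len  <- length(ans)
  ypos <- which(test)
  npos <- which(!test)
  if (length(ypos) > 0L)
    ans[ypos] <- rep(yes, length.out = len)[ypos]
  if (length(npos) > 0L)
    ans[npos] <- rep(no, length.out = len)[npos]
  ans
}

# Small check on a made-up vector
d   <- as.Date(c("2018-09-13", NA, "2019-03-26"))
res <- ifelsenew(d > as.Date("2019-01-01"), as.Date(NA), d)
class(res)  # "Date"
```

Because ans is a Date vector from the start, the assignments keep the class, and positions where test is NA stay NA.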
Data:
df <- structure(list(
  my_date_01 = structure(c(17787, NA, NA, NA, NA, NA, NA, NA, 17981, NA),
                         class = "Date"),
  my_date_02 = structure(c(NA, 17874, 17881, 17915, 17854, 17881, 17751,
                           17828, NA, 17876), class = "Date")),
  row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"),
  class = "data.frame")