I am struggling to find a simpler way to replace values in a date column of a dataset (tibble) with NA under specific conditions.
Here is some code that does the task. What I want is to replace dates from June 2020 onward, or before January 1900, with missing values, without changing the missing values that are already there. But this code is very ugly. Is there a simpler way to do it, especially using tidyverse tools?
library(lubridate)
library(dplyr)
x <- seq(1,5)
y <- c(dmy("04/02/1863", "29/10/1989", "16/03/2000", "14/05/2021", NA))
dat <- tibble(x,y)
dat$y[which(dat$y >= dmy("01/06/2020") | dat$y < dmy("01/01/1900"))] <-
rep(NA, length(dat$y[which(dat$y >= dmy("01/06/2020") | dat$y < dmy("01/01/1900"))]))
dat
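You can use if_else() like so. Rows where y is already NA stay NA, because the condition itself evaluates to NA for them and if_else() returns a missing value in that case: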
library(lubridate)
library(dplyr)
x <- seq(1,5)
y <- c(dmy("04/02/1863", "29/10/1989", "16/03/2000", "14/05/2021", NA))
dat <- tibble(x,y)
dat %>%
  mutate(y = if_else(y >= dmy("01/06/2020") | y < dmy("01/01/1900"), NA_Date_, y))
#> # A tibble: 5 x 2
#>       x y
#>   <int> <date>
#> 1     1 NA
#> 2     2 1989-10-29
#> 3     3 2000-03-16
#> 4     4 NA
#> 5     5 NA
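If you would rather not spell out the typed NA, a variant with base R's replace() inside mutate() should also work. This is only a sketch: the names lower, upper and out are made up for illustration, and the cutoffs assume "June 2020 onward" means on or after 2020-06-01, as in the question's code.

```r
library(dplyr)

# Same example data as in the question, built with as.Date() so that
# lubridate is not needed for this sketch.
dat <- tibble(
  x = 1:5,
  y = as.Date(c("1863-02-04", "1989-10-29", "2000-03-16", "2021-05-14", NA))
)

# Hypothetical helper names for the two cutoffs.
lower <- as.Date("1900-01-01")  # dates before this become NA
upper <- as.Date("2020-06-01")  # dates on or after this become NA

# which() keeps only the positions where the condition is TRUE, so rows
# where y is already NA are never indexed and are left untouched.
out <- dat %>%
  mutate(y = replace(y, which(y < lower | y >= upper), NA))

out
```

The which() call is what guards the existing missing values: comparing NA with a date gives NA, and which() silently drops those positions instead of passing them on as an index.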