
How to replace values in a date column with NA based on a condition


I am struggling to find an easier/simpler way to replace values in a date column of a dataset (tibble) with NA under specific conditions.

Here is some code that attempts the task. I want to replace dates from June 2020 onward and dates before January 1900 with missing values, without changing the missing values that are already there. But this code is very ugly. Is there a simpler way to do it, especially using tidyverse tools?

library(lubridate)
library(dplyr)

x <- seq(1,5)
y <- c(dmy("04/02/1863", "29/10/1989", "16/03/2000", "14/05/2021", NA))
dat <- tibble(x,y)

dat$y[which(dat$y >= dmy("01/06/2020") | dat$y < dmy("01/01/1900"))] <- 
  rep(NA, length(dat$y[which(dat$y >= dmy("01/06/2020") | dat$y < dmy("01/01/1900"))]))

dat

Solution

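  • You can use if_else() inside mutate(), with NA_Date_ (from lubridate) as the replacement so that both branches have the same type. Rows where y is already NA stay NA, because the condition itself evaluates to NA for them: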

    library(lubridate)
    library(dplyr)
    
    x <- seq(1,5)
    y <- c(dmy("04/02/1863", "29/10/1989", "16/03/2000", "14/05/2021", NA))
    dat <- tibble(x,y)
    
    dat %>% 
      mutate(y = if_else(y >= dmy("01/06/2020") | y < dmy("01/01/1900"), NA_Date_, y))
    #> # A tibble: 5 x 2
    #>       x y         
    #>   <int> <date>    
    #> 1     1 NA        
    #> 2     2 1989-10-29
    #> 3     3 2000-03-16
    #> 4     4 NA        
    #> 5     5 NA
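
An alternative sketch, not part of the answer above: base R's replace() can do the same job inside mutate(). The names lower, upper and out are only illustrative, and which() is used here so that rows whose condition is NA are skipped rather than indexed. This assumes the same cut-offs as in the question (before 1 Jan 1900, or on/after 1 Jun 2020).

```r
library(lubridate)
library(dplyr)

# Same example data as in the question
dat <- tibble(
  x = 1:5,
  y = dmy(c("04/02/1863", "29/10/1989", "16/03/2000", "14/05/2021", NA))
)

# Illustrative names for the two cut-off dates
lower <- dmy("01/01/1900")
upper <- dmy("01/06/2020")

# which() returns only the positions where the condition is TRUE,
# so existing NAs in y are left untouched
out <- dat %>%
  mutate(y = replace(y, which(y < lower | y >= upper), NA))

out
```

Whether this reads better than if_else() is a matter of taste; if_else() keeps the type check explicit via NA_Date_, while replace() avoids having to spell out the "otherwise keep y" branch.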