
How to conditionally replace values with TRUE when two variables are both NA?


How can I write the following condition, preferably using case_when? I need to replace the NAs with TRUE in rows where two variables (name and salary) are both NA.

df <- data.frame(
   id = c(1:5),
   name = c("Rick","Dan","Michelle",NA,"Gary"),
   salary = c(623.3,515.2,611.0,NA,843.25), 
   start_date = as.Date(c(NA, "2013-09-23", "2014-11-15", "2014-05-11",
      "2015-03-27")),
   stringsAsFactors = FALSE
)

My desired output looks like this:

  id     name salary start_date
1  1     Rick 623.30 NA
2  2      Dan 515.20 2013-09-23
3  3 Michelle 611.00 2014-11-15
4  4     TRUE   TRUE 2014-05-11
5  5     Gary 843.25 2015-03-27

The condition would be something like this, but it does not replace the values in their columns:

case_when(is.na(df$name) & is.na(df$salary) ~ TRUE)

Thanks in advance for your input.


Solution

  • mutate_cond, taken from the answer to "dplyr mutate/replace several columns on a subset of rows" and reproduced in the Note at the end, provides a simple way to implement this. It is a user-defined helper, not part of dplyr. Its arguments are the data frame, the condition, and the assignments to make on the rows for which the condition holds.

    library(dplyr)
    
    df %>% mutate_cond(is.na(name) & is.na(salary), name = "TRUE", salary = "TRUE")
    

    giving the following (salary becomes a character column because the string "TRUE" is assigned to it):

      id     name salary start_date
    1  1     Rick  623.3       <NA>
    2  2      Dan  515.2 2013-09-23
    3  3 Michelle    611 2014-11-15
    4  4     TRUE   TRUE 2014-05-11
    5  5     Gary 843.25 2015-03-27
    

    Note

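    # see link above
    mutate_cond <- function(.data, condition, ..., envir = parent.frame()) {
      # evaluate the condition against the columns of .data, falling back to envir
      condition <- eval(substitute(condition), .data, envir)
      # apply the assignments only to the rows where the condition holds
      .data[condition, ] <- .data[condition, ] %>% mutate(...)
      .data
    }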
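
Since the question asks for case_when, here is a rough sketch of that route using only dplyr. It assumes that turning salary into a character column is acceptable (a column cannot hold both numbers and the string "TRUE"), and both_na is a helper column name made up for this example. The condition is computed once up front so that overwriting name does not change the test used for salary.

```r
library(dplyr)

df <- data.frame(
  id = 1:5,
  name = c("Rick", "Dan", "Michelle", NA, "Gary"),
  salary = c(623.3, 515.2, 611.0, NA, 843.25),
  start_date = as.Date(c(NA, "2013-09-23", "2014-11-15", "2014-05-11",
                         "2015-03-27")),
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(
    # helper column (hypothetical name): TRUE where both values are missing
    both_na = is.na(name) & is.na(salary),
    # both branches are character, so case_when accepts them
    name = case_when(both_na ~ "TRUE", TRUE ~ name),
    # salary is converted to character to match the "TRUE" branch
    salary = case_when(both_na ~ "TRUE", TRUE ~ as.character(salary))
  ) %>%
  select(-both_na)  # drop the helper column again

print(result)
```

The original attempt returned a standalone vector and never assigned it back to a column, which is why nothing in df changed. Wrapping each case_when in mutate, with a fallback branch that keeps the existing value, is what writes the result into name and salary.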