Create variable that captures if there are missing fields in 4 string variables

I am creating dummy variables where missing values are coded 1 and non-missing values are coded 0. The missing values are NA.


My code for one variable at a time successfully created the dummy variable:


#create new dummy variable (var1 stands in for whichever column is being checked)
df <- mutate(df, newvar = ifelse(is.na(var1), 1, 0))

sum(df$newvar == 1)

I have 4 string variables and want to create a new dummy variable where missing values in any of the variables are 1, and non-missing values are 0. I tried reusing the above code:

mylist <- c("var1", "var2", "var3", "var4")

for(i in mylist){
  df <- mutate(df, newvar = ifelse(is.na(i), 1, 0))
}

I know that I am incorrectly using the for loop, but is this the correct approach, or should I be doing something different?


  • We can use mutate with across

    library(dplyr) # version >= 1.0.0  
    df <- df %>%
              mutate(across(all_of(mylist), ~ +(is.na(.)), .names = '{col}_newvar'))

    If we have an earlier version, use mutate_at

    df %>%
       mutate_at(vars(mylist), ~ +(is.na(.)))

    If we need to create a single new column that flags whether any of the columns in 'mylist' has a missing value

    df1 <- df %>%
        mutate(newvar = +(rowSums(is.na(select(., all_of(mylist)))) > 0))
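
To check the row-wise flag on something small, here is a minimal base R sketch of the same idea (count the NAs per row, then test whether the count is above zero). The toy data frame and its values are made up for illustration; only the column names and `mylist` come from the question.

```r
# Toy data: values are hypothetical, column names mirror the question.
df <- data.frame(
  var1 = c("a", NA, "c", "d"),
  var2 = c("e", "f", NA, "h"),
  var3 = c("i", "j", "k", "l"),
  var4 = c(NA, "n", "o", "p"),
  stringsAsFactors = FALSE
)
mylist <- c("var1", "var2", "var3", "var4")

# is.na() on the selected columns returns a logical matrix;
# rowSums() then counts the NAs in each row.
na_count <- rowSums(is.na(df[mylist]))

# 1 if any of the four columns is NA in that row, otherwise 0.
df$newvar <- as.integer(na_count > 0)

print(df$newvar)
sum(df$newvar == 1)
```

This avoids the loop entirely: the loop in the question would overwrite `newvar` on every iteration, whereas `rowSums()` looks at all four columns at once.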