r, if-statement, lapply, missing-data

Applying ifelse to multiple variables / columns to replace 99 and 999 with NA


I have a data frame in which some columns use 99 as the code for missing values (NA) and other columns use 999 for the same purpose. At the moment I recode each column separately:

dat$variable1 <- ifelse(dat$variable1 == 99, NA, dat$variable1)
dat$variable2 <- ifelse(dat$variable2 == 99, NA, dat$variable2)
dat$variable3 <- ifelse(dat$variable3 == 99, NA, dat$variable3)
dat$variable4 <- ifelse(dat$variable4 == 99, NA, dat$variable4)
dat$variable5 <- ifelse(dat$variable5 == 999, NA, dat$variable5)
dat$variable6 <- ifelse(dat$variable6 == 999, NA, dat$variable6)
dat$variable7 <- ifelse(dat$variable7 == 999, NA, dat$variable7)

I'd like to find a better way to do this, because sometimes there are many columns to deal with. I don't know how to loop over the specific variables in which these values should be replaced with NA, and I'm not aware of a package that could help me with that (I'm a beginner in R).

EDIT: I apologise for a mistake in my question. I originally posted dat$variable1 <- ifelse(dat$variable1 == 99, NA, dat$EC), with "dat$EC" as the last argument in every line of code. Thank you all for the answers.


Solution

  • Consider running ifelse on a block of columns since it works on vectors and matrices:

    var_99 <- c("variable1", "variable2", "variable3", "variable4")
    var_999 <- c("variable5", "variable6", "variable7")
    
    dat[var_99] <- ifelse(dat[var_99] == 99, NA, dat$EC)
    dat[var_999] <- ifelse(dat[var_999] == 999, NA, dat$EC)
    

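    The block above follows the question as first posted, so every kept value comes from dat$EC. To keep each column's own values, as in the corrected question, pass the same block of columns as the no argument, coerced to a matrix: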

    dat[var_99] <- ifelse(dat[var_99] == 99, NA, as.matrix(dat[var_99]))
    dat[var_999] <- ifelse(dat[var_999] == 999, NA, as.matrix(dat[var_999]))
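
To see the block-wise replacement in action, here is a minimal sketch on a made-up data frame. The values, the reduced set of columns, and the names dat_ifelse and dat_index are assumptions for illustration only. The second variant is not from the answer above: it swaps ifelse for plain logical-matrix indexing, which is another common base R idiom for the same task.

```r
# Hypothetical toy data: column names mirror the question, values are made up.
dat <- data.frame(
  variable1 = c(1, 99, 3),
  variable2 = c(99, 5, 6),
  variable5 = c(999, 8, 99)  # here 99 is a real value; only 999 means missing
)

var_99  <- c("variable1", "variable2")
var_999 <- c("variable5")

# Variant 1: ifelse on a block of columns, keeping each column's own values
# by passing the same block, coerced to a matrix, as the `no` argument.
dat_ifelse <- dat
dat_ifelse[var_99]  <- ifelse(dat_ifelse[var_99] == 99, NA,
                              as.matrix(dat_ifelse[var_99]))
dat_ifelse[var_999] <- ifelse(dat_ifelse[var_999] == 999, NA,
                              as.matrix(dat_ifelse[var_999]))

# Variant 2 (an alternative, not the answer's method): comparing a block of
# columns to a scalar gives a logical matrix, which can index that same block.
dat_index <- dat
dat_index[var_99][dat_index[var_99] == 99]     <- NA
dat_index[var_999][dat_index[var_999] == 999]  <- NA

print(dat_ifelse)
print(dat_index)
```

Keeping the 99 columns and the 999 columns in separate name vectors means a legitimate 99 in a 999-coded column (as in variable5 here) is left untouched.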