I have a data frame "abnormal2" that contains many columns; I have cut it down to a few columns as an example:
id class1 class2 class3 class4 class5
1 1 PATH PATH PATH PATH PATH
2 2 rLS PATH LB <NA>
3 3 PATH PATH PATH <NA> <NA>
4 4 PATH VUS VUS <NA> <NA>
5 5 PATH VUS VUS <NA> <NA>
6 6 PATH PATH VUS PATH <NA>
7 7 MPATH VUS VUS <NA> <NA>
8 8 VUS VUS VUS <NA> <NA>
9 9 PATH VUS VUS <NA> <NA>
10 10 PATH PATH <NA> <NA>
What I want to do is replace any cell that does not match one of a list of strings (MPATH, VUS_LPATH, VUS_LB, PATH, VUS, LB, Normal) with NA. This replacement applies only to the columns class1 to class5; the result should look like this:
id class1 class2 class3 class4 class5
1 1 PATH PATH PATH PATH PATH
2 2 NA NA PATH LB <NA>
3 3 PATH PATH PATH <NA> <NA>
4 4 PATH VUS VUS <NA> <NA>
5 5 PATH VUS VUS <NA> <NA>
6 6 PATH PATH VUS PATH <NA>
7 7 MPATH VUS VUS <NA> <NA>
8 8 VUS VUS VUS <NA> <NA>
9 9 PATH VUS VUS <NA> <NA>
10 10 PATH PATH NA <NA> <NA>
I used the code below, but it does not work:
sel <- grepl("class",names(abnormal2))
abnormal2[sel] <- data.frame(lapply(abnormal2[sel], function(x) gsub("[^MPATH|^VUS\\_LPATH|^VUS\\_LB|^PATH|^VUS|^LB|^Normal]","", x)))
If your string matches are exact (rather than requiring regex) then, using your idea as a basis, the following will work.
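# Select the class columns by name
sel <- grepl("class", names(abnormal2))
# Values that are allowed to remain
matches <- c("MPATH", "VUS_LPATH", "VUS_LB", "PATH", "VUS", "LB", "Normal")
abnormal2[sel] <- data.frame(lapply(abnormal2[sel], function(x) {
  x[!x %in% matches] <- NA  # anything not in the allowed set becomes NA
  x
}), stringsAsFactors = FALSE)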
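As for why the original attempt fails: square brackets in a regex define a character class, so that pattern matches single characters not in the listed set rather than whole words, and gsub() replaces them with "" instead of NA. A value such as rLS, whose letters all occur somewhere in the list, is therefore left untouched. If you do want a regex, a minimal sketch would be an anchored alternation built from the same vector. The toy data frame below is a hypothetical stand-in for abnormal2, and the name pattern is my own:

```r
# Hypothetical toy data standing in for the real abnormal2
abnormal2 <- data.frame(
  id     = 1:3,
  class1 = c("PATH", "rLS", "MPATH"),
  class2 = c("VUS", "PATH", NA),
  stringsAsFactors = FALSE
)

matches <- c("MPATH", "VUS_LPATH", "VUS_LB", "PATH", "VUS", "LB", "Normal")
sel <- grepl("^class", names(abnormal2))

# Anchored alternation: ^(A|B|...)$ matches only when the whole value is one of the labels
pattern <- paste0("^(", paste(matches, collapse = "|"), ")$")

abnormal2[sel] <- lapply(abnormal2[sel], function(x) {
  x <- as.character(x)          # guard against factor columns
  x[!grepl(pattern, x)] <- NA   # grepl() returns FALSE for NA, so existing NAs stay NA
  x
})

print(abnormal2)
```

For exact labels like these, %in% is simpler and avoids having to escape regex metacharacters; the regex form is mainly useful if the allowed values need to be patterns rather than fixed strings.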