In my data, I need to remove the rows if the values in three columns (V1, V2, V3) are either a combination of 12 and NAs (like row 2) or all three of them equal 12 (like row 5). Please note that if all values equal NA (like row 3) it should remain in the data.
df <- data.frame(
"V1" = c(NA, NA, NA, 12, 12),
"V2" = c(55, NA, NA, 14, 12),
"V3" = c(21, 12, NA, NA, 12),
"V4" = c(NA, 32, NA, NA, NA),
"V5" = c(NA, NA, 18, NA, NA)
)
V1 V2 V3 V4 V5
1 NA 55 21 NA NA
2 NA NA 12 32 NA
3 NA NA NA NA 18
4 12 14 NA NA NA
5 12 12 12 NA NA
I would like the following result:
V1 V2 V3 V4 V5
1 NA 55 21 NA NA
3 NA NA NA NA 18
4 12 14 NA NA NA
Thanks in advance for your help.
First set a col
variable storing the target column names. The total number of records being NA
or 12
should match the length
of col
.
col <- c("V1", "V2", "V3")
df[apply(df[, col], 1, \(x) sum((is.na(x) | x == 12), na.rm = T) != length(col)), ]
Or
df[rowSums(is.na(df[, col]) | df[, col] == 12, na.rm = TRUE) < length(col), ]
Update: To remove rows that either include both 12
and NA
or all of the values equal 12
, use the following code:
df[apply(df[, col], 1, \(x) !((sum((is.na(x) | x == 12), na.rm = T) == length(col)) &
(sum(is.na(x)) >= 1 & sum(x == 12, na.rm = T) >= 1) |
sum(x == 12, na.rm = T) == length(col))), ]
V1 V2 V3 V4 V5
1 NA 55 21 NA NA
3 76 NA NA NA 12
4 12 14 NA NA NA