I'd like to remove rows that have NA in any of the columns listed in a vector of column names.
Here's a simplified example with just a couple of columns.
data <- structure(list(sample_id = c("2023.01.12_2", "2023.01.12_27",
"2023.01.12_27", "2023.01.12_3", "2023.01.12_27", "2023.01.12_27",
"2023.01.12_4", "2023.01.12_27", "2023.01.12_27", "2023.01.12_5"
), group = c("Unedited", "Rob", "Rob", "Partial_promoter", "Rob",
"Rob", "Promoter_and_ATG", "Rob", "Rob", "ATG"), day = c(6, NA,
NA, 6, NA, NA, 6, NA, NA, 6), x = c(114.243333333333, 115.036666666667,
115.073333333333, 114.41, 116.11, 116.163333333333, 113.426666666667,
116.15, 117.253333333333, 113.46)), row.names = c(NA, -10L), class = "data.frame")
cols <- c("group", "day")
I've tried a few ways, but can't get everything working. The one below seems to work for dropping the NA rows:
data %>%
  filter(across(.cols = cols, .fns = ~ !is.na(.x)))
But when I try reversing it, to select the rows that have NA in those columns
(for QC purposes I want to keep them, but just separately), I get nothing:
data %>%
  filter(across(.cols = cols, .fns = ~ is.na(.x)))
Any ideas?
You could use tidyr's drop_na together with any_of to restrict the NA check to the columns you mentioned. Here is some reproducible code:
cols <- c("group", "day")
library(tidyr)
data |>
  drop_na(any_of(cols))
#>      sample_id            group day        x
#> 1 2023.01.12_2         Unedited   6 114.2433
#> 2 2023.01.12_3 Partial_promoter   6 114.4100
#> 3 2023.01.12_4 Promoter_and_ATG   6 113.4267
#> 4 2023.01.12_5              ATG   6 113.4600
Created on 2023-01-16 with reprex v2.0.2
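
As for the second half of the question (keeping the NA rows separately for QC): the reversed filter most likely returns nothing because across() inside filter() combines the columns with AND, so it only keeps rows where both group and day are NA, and group is never NA in the example. Below is a minimal sketch of one way to get both subsets, assuming a dplyr version that provides if_all() and if_any() (1.0.4 or later). The small data frame is a shortened stand-in for the question's data, and the names complete_rows and na_rows are made up for illustration.

```r
library(dplyr)

# Shortened stand-in for the question's data (same columns, fewer rows).
data <- data.frame(
  sample_id = c("2023.01.12_2", "2023.01.12_27", "2023.01.12_3", "2023.01.12_27"),
  group     = c("Unedited", "Rob", "Partial_promoter", "Rob"),
  day       = c(6, NA, 6, NA),
  x         = c(114.24, 115.04, 114.41, 116.11)
)
cols <- c("group", "day")

# Rows with no NA in any of the listed columns
# (the same rows drop_na(any_of(cols)) would keep).
complete_rows <- data |>
  filter(if_all(all_of(cols), ~ !is.na(.x)))

# Rows with an NA in at least one of the listed columns, kept separately for QC.
na_rows <- data |>
  filter(if_any(all_of(cols), is.na))
```

Here if_all() requires the condition to hold in every listed column, while if_any() only needs it to hold in one, so the two results split the data into complementary sets.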