I have a data frame with multiple samples collected from different individuals, which looks like this:
ID | Subject | Week |
---|---|---|
ID01 | S01 | Week_2 |
ID02 | S01 | Week_4 |
ID03 | S01 | Week_5 |
ID04 | S02 | Week_3 |
ID05 | S03 | Week_2 |
ID06 | S03 | Week_4 |
ID07 | S04 | Week_1 |
ID08 | S04 | Week_4 |
ID09 | S04 | Week_5 |
Using dplyr, I want to drop every subject (and all of its samples) that does not have both the Week_4 and Week_5 collection time points, so that I have
ID | Subject | Week |
---|---|---|
ID01 | S01 | Week_2 |
ID02 | S01 | Week_4 |
ID03 | S01 | Week_5 |
ID07 | S04 | Week_1 |
ID08 | S04 | Week_4 |
ID09 | S04 | Week_5 |
at the end.
You can do this with all() inside filter(), evaluated separately for each Subject:
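library(dplyr)
# Both of these weeks must be present for a subject to be kept
keep_week <- c("Week_4", "Week_5")
# .by = Subject evaluates the condition once per subject (requires dplyr >= 1.1.0)
df %>% filter(all(keep_week %in% Week), .by = Subject)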
# ID Subject Week
#1 ID01 S01 Week_2
#2 ID02 S01 Week_4
#3 ID03 S01 Week_5
#4 ID07 S04 Week_1
#5 ID08 S04 Week_4
#6 ID09 S04 Week_5
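
For reference, here is a self-contained sketch of the same idea. It assumes the data frame is called df, as above, and rebuilds it from the table in the question. It uses group_by()/ungroup() rather than the .by argument, so it should also run on dplyr versions older than 1.1.0. The name result is only illustrative.

```r
library(dplyr)

# Example data rebuilt from the table in the question
df <- data.frame(
  ID      = sprintf("ID%02d", 1:9),
  Subject = c("S01", "S01", "S01", "S02", "S03", "S03", "S04", "S04", "S04"),
  Week    = c("Week_2", "Week_4", "Week_5", "Week_3", "Week_2", "Week_4",
              "Week_1", "Week_4", "Week_5")
)

# A subject is kept only if every one of these weeks appears among its rows
keep_week <- c("Week_4", "Week_5")

result <- df %>%
  group_by(Subject) %>%                # evaluate the condition per subject
  filter(all(keep_week %in% Week)) %>% # TRUE for a group keeps all of its rows
  ungroup()                            # drop the grouping before further use
```

Within each group, keep_week %in% Week returns one logical value per required week, and all() collapses them to a single TRUE or FALSE that filter() recycles across the group. That is why the Week_1 and Week_2 rows of qualifying subjects are kept as well.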