r dataframe dplyr tidyverse subset

How to do nested subsetting in R


In my DATA below, I want to keep only the Districts whose Status includes both Monitor and Never and that have at least one No in Answer.

Then, within each of those Districts, I want to keep only the rows where Status is Monitor or Never and Answer is No.

The desired output is only the 1st, 3rd, 10th, and last rows (see below). Is this possible?

I tried the following (without success):

library(dplyr)

group_by(DATA, District) %>%
  filter(Status %in% c("Monitor", "Never") & Answer == "No") %>%
  ungroup()

DATA <- read.table(header = TRUE, text = "
District  Status   Answer
A         Monitor  No    #--> Filter this row
A         Never    Yes
A         Never    No    #--> Filter this row
A         Ever     No
B         Never    Yes
B         Never    No
B         Never    No
C         Former   No
C         Never    No
D         Never    No   #--> Filter this row
D         Monitor  Yes
D         Monitor  No   #--> Filter this row
")

Solution

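  • "Includes both Monitor and Never" is a condition on the whole District, not on a single row, so test it per group with all(c("Monitor","Never") %in% Status) first. The rest of your filter can then follow as row-level conditions. Your attempt kept the B and C rows because it only checked each row on its own.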

    group_by(DATA, District) %>%
      filter(
        all(c("Monitor","Never") %in% Status),
        Status %in% c("Monitor", "Never"),
        Answer == "No"
      ) %>%
      ungroup()
    # # A tibble: 4 × 3
    #   District Status  Answer
    #   <chr>    <chr>   <chr> 
    # 1 A        Monitor No    
    # 2 A        Never   No    
    # 3 D        Never   No    
    # 4 D        Monitor No
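
For comparison, here is a minimal base R sketch of the same two-step logic, assuming the DATA from the question (with the row comments dropped). The names keep_district and base_result are only illustrative and are not part of the original answer. tapply() computes the group-level condition once per District, and that result is then looked up for every row and combined with the row-level conditions.

```r
DATA <- read.table(header = TRUE, text = "
District  Status   Answer
A         Monitor  No
A         Never    Yes
A         Never    No
A         Ever     No
B         Never    Yes
B         Never    No
B         Never    No
C         Former   No
C         Never    No
D         Never    No
D         Monitor  Yes
D         Monitor  No
")

# Group-level condition: one TRUE/FALSE per District
# (TRUE only when the District's Status contains both values).
keep_district <- tapply(
  DATA$Status, DATA$District,
  function(s) all(c("Monitor", "Never") %in% s)
)

# Look up the group-level result for each row by District name,
# then combine it with the row-level conditions.
base_result <- DATA[
  keep_district[as.character(DATA$District)] &
    DATA$Status %in% c("Monitor", "Never") &
    DATA$Answer == "No",
]

# base_result holds original rows 1, 3, 10 and 12
```

Unlike the grouped filter(), this keeps the original row names, which makes it easy to confirm that the 1st, 3rd, 10th, and 12th rows were selected.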