Delete all rows based on corresponding values in multiple columns

Edited question:

I would like to create a new data frame by filtering an existing one on multiple conditions. I tried the code from these two questions: (Subset data frame based on multiple conditions) and (Remove group from data.frame if at least one group member meets condition).

A small portion of the full dataset:

df<- structure(list(pat_id = c(10302, 10302, 10302, 
                          10482, 10482,10482,
                          10613, 10613, 10613, 
                          16190, 16190, 16190, 
                          16220, 16220,16220, 16220, 16220, 16220, 16220, 16220), 
               date = c("2014-04-22","2018-12-13", "2020-07-27", "2019-07-15", "2019-09-19", "2019-09-23", 
                         "2015-09-29", "2015-10-06", "2015-11-20", "2013-07-08", "2018-01-30", 
                         "2020-01-09", "2016-06-15", "2018-02-23", "2019-02-14", "2019-08-09", 
                         "2020-03-02", "2020-07-03", "2020-11-09", "2020-12-16"), 
               number = c(1,2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 8), 
               col1 = c(0,1, 1, 2, 4, 4, 9, 3, 1, 0, 1, 1, 9, 9, 9, 9, 9, 9, 9, 9), 
               col2 = c(NA_real_,NA_real_, NA_real_, 0, 1, NA_real_, NA_real_, NA_real_, 
                        NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
                        NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), 
               col3 = c(NA_real_,NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
                        NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
                        NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), 
                class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L), groups = structure(list(
               pat_id = c(10302, 10482, 10613, 16190, 16220), .rows = structure(list(
                        1:3, 4:6, 7:9, 10:12, 13:20), ptype = integer(0), class = c("vctrs_list_of", 
                        "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
                        ), row.names = c(NA, -5L), .drop = TRUE))

I want to create a new dataframe based on the following conditions.

If number is 1 or 2 AND any of col1, col2 or col3 is 1 in the same row, then delete all rows with the corresponding pat_id value.

Desired output:

id        date    number    col1     col2     col3
10613      ..      1         9        NA       NA
10613      ..      2         3        NA       NA
10613      ..      3         1        NA       NA
etc

I've tried df1 <- df %>% group_by(pat_id) %>% filter(any(!(number <= 2 & (col1 == 1 | col2==1 | col3==1))))

But this does not seem to work. Could it be because of the class/structure of the data frame? I can't figure it out. If I create a 'dummy' data frame with similar columns, this code does work, but not on the big dataset.

Any tips?

Solution

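First of all, make sure the columns you compare are numeric. After that you can group_by id and keep only the groups in which every row satisfies your condition. Here is a simplified example with just an id and a number column, which drops every id that has a number equal to 1: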

library(dplyr)

df %>%
  group_by(id) %>%
  filter(all(number > 1))
#> # A tibble: 3 × 2
#> # Groups:   id [2]
#>   id    number
#>   <chr>  <dbl>
#> 1 12         2
#> 2 13         2
#> 3 13         3

Created on 2023-08-16 with reprex v2.0.2

Data used:

id <- c('10', '10', '10', '11', '11', '12', '13', '13', '14', '15', '15')
number <- c(1, 2, 3, 1, 2, 2, 2, 3, 1, 1, 2)
df <- data.frame(id, number)
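
The same grouped filter can be applied to the columns from the question. The following is only a sketch under a few assumptions: the data is rebuilt here as a plain, ungrouped tibble with the question's values (the date column is left out for brevity), and %in% is used instead of == so that NA values in col1, col2 and col3 count as "not 1" rather than propagating NA into the condition. Note also that the attempt in the question uses any(!(...)), which keeps a group as soon as one row does not match; !any(...) keeps a group only when no row matches.

```r
library(dplyr)

# Question's values rebuilt as an ungrouped tibble (date omitted for brevity)
df <- tibble(
  pat_id = c(rep(10302, 3), rep(10482, 3), rep(10613, 3),
             rep(16190, 3), rep(16220, 8)),
  number = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1:8),
  col1   = c(0, 1, 1, 2, 4, 4, 9, 3, 1, 0, 1, 1, rep(9, 8)),
  col2   = c(NA, NA, NA, 0, 1, rep(NA, 15)),
  col3   = NA_real_
)

df1 <- df %>%
  group_by(pat_id) %>%
  # TRUE for a row if number is 1 or 2 and any of the three columns is 1;
  # %in% returns FALSE for NA, so missing values never trigger a removal
  filter(!any(number <= 2 & (col1 %in% 1 | col2 %in% 1 | col3 %in% 1))) %>%
  ungroup()

unique(df1$pat_id)
```

With these values, pat_id 10302, 10482 and 16190 each have a 1 in col1 or col2 at number 2 and are dropped, while 10613 (its only 1 is at number 3) and 16220 are kept.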