I have a table like the following:
date user  X1 X2 X3
1/1     1   0  3 34
2/1     1   0  7 65
3/1     1   0  0  0
4/1     1  25  4 65
1/1     2 285  0  0
2/1     2   0  0  0
3/1     2   0 54  0
4/1     2   0  0  0
How can I use dplyr to select only the users that have non-zero data at the last available date (4/1) across the Xs? In this case user 2 should be removed. Thanks
Another option is to use any() with c_across() to check whether any X value is non-zero in the last row of each user (row_number() == n()), like this:
library(dplyr)
df %>%
  group_by(user) %>%
  # keep a user if any X column is non-zero in that user's last row
  filter(any(c_across(starts_with("X")) != 0 & row_number() == n()))
#> # A tibble: 4 × 5
#> # Groups:   user [1]
#>   date   user    X1    X2    X3
#>   <chr> <int> <int> <int> <int>
#> 1 1/1       1     0     3    34
#> 2 2/1       1     0     7    65
#> 3 3/1       1     0     0     0
#> 4 4/1       1    25     4    65
Created on 2023-03-15 with reprex v2.0.2
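
If it helps, here is a self-contained sketch of a variant that may be easier to read. It is not the code from the answer above: it swaps in if_any() and last() so the check does not depend on recycling row_number() across the concatenated columns. The df below is rebuilt by hand from the table in the question, and the name res is only for illustration. It assumes the rows are already in date order within each user.

```r
library(dplyr)

# Sample data rebuilt from the question's table (assumed to be sorted by date per user)
df <- data.frame(
  date = rep(c("1/1", "2/1", "3/1", "4/1"), times = 2),
  user = rep(1:2, each = 4),
  X1   = c(0, 0, 0, 25, 285, 0, 0, 0),
  X2   = c(3, 7, 0, 4, 0, 0, 54, 0),
  X3   = c(34, 65, 0, 65, 0, 0, 0, 0)
)

res <- df %>%
  group_by(user) %>%
  # if_any() gives one TRUE/FALSE per row (is any X non-zero?);
  # last() takes that flag for the user's final row and keeps or drops the whole group
  filter(last(if_any(starts_with("X"), ~ .x != 0))) %>%
  ungroup()

res
```

On the sample data this should keep the four rows of user 1 and drop user 2. If "for all Xs" is meant strictly (every X must be non-zero at the last date), replacing if_any() with if_all() would express that instead.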