I would like to filter conditionally, based on the group size.
Suppose I have a data frame that looks like this:
data1 <- data.frame(
  ID = c(1, 1, 1, 3, 3, 5, 6),
  town = c("Town A", "Town A", "Town B", "Town A", "Town C", "Town B", "Town A"),
  place = c("A", "B", "A", "B", "C", "A", "B"),
  place1 = c("A", "c", "A", "B", "C", "A", "D"),
  test = c("G", "B", "A", "B", "C", "A", "B"),
  test1 = c("G", "B", "A", "B", "d", "A", "B")
)
I want to keep one town per ID: first filter on place == place1, and if a group still has more than one row, filter on test == test1.
I've tried something like:
data1 %>%
  group_by(ID) %>%
  filter(if (n() >= 2) place == place1 else test == test1) %>%
  filter(n() == 1) %>%
  ungroup()
But the if/else does not work: IDs 1 and 3 are missing from the result.
Sort your data by your conditions (descending, so that TRUE comes before FALSE), and then slice 1 row per group:
library(dplyr)  # the `.by` argument needs dplyr >= 1.1.0

data1 |>
  arrange(ID, desc(place == place1), desc(test == test1)) |>
  slice(1, .by = ID)
#   ID   town place place1 test test1
# 1  1 Town A     A      A    G     G
# 2  3 Town A     B      B    B     B
# 3  5 Town B     A      A    A     A
# 4  6 Town A     B      D    B     B
Note that if there are ties (like rows 1 and 3 in your original data), this will probably keep whichever tied row comes first in the original data, but I wouldn't rely on that.
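If ties matter, one option is to make the tiebreak explicit instead of relying on sort order, and to check which IDs are actually tied. This is only a sketch, assuming dplyr >= 1.1.0 (for `.by`). The helper columns `orig_row`, `place_ok` and `test_ok` are names I made up for illustration, and "earliest original row wins" is just one possible rule:

```r
library(dplyr)

data1 <- data.frame(
  ID = c(1, 1, 1, 3, 3, 5, 6),
  town = c("Town A", "Town A", "Town B", "Town A", "Town C", "Town B", "Town A"),
  place = c("A", "B", "A", "B", "C", "A", "B"),
  place1 = c("A", "c", "A", "B", "C", "A", "D"),
  test = c("G", "B", "A", "B", "C", "A", "B"),
  test1 = c("G", "B", "A", "B", "d", "A", "B")
)

# Rank rows within each ID: place match first, then test match,
# then original row position as an explicit (assumed) tiebreaker.
ranked <- data1 |>
  mutate(
    orig_row = row_number(),     # hypothetical helper: original row position
    place_ok = place == place1,
    test_ok  = test == test1
  ) |>
  arrange(ID, desc(place_ok), desc(test_ok), orig_row)

# One row per ID; ties are now resolved by orig_row.
result <- ranked |>
  slice(1, .by = ID) |>
  select(-orig_row, -place_ok, -test_ok)

# IDs where more than one row shares the best combination of conditions,
# i.e. where the tiebreaker actually decided the outcome.
tied_ids <- ranked |>
  filter(place_ok == first(place_ok), test_ok == first(test_ok), .by = ID) |>
  count(ID) |>
  filter(n > 1) |>
  pull(ID)
```

With your sample data, `tied_ids` should contain only ID 1 (rows 1 and 3 both match on both conditions), so that is the one group where you would need to decide which town you actually want.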