select a grouped row based on another variable

I have a data set where for each region there is a value pos, but some regions don't have a value of 1.

region <- c('A','A','B','B','C','D','D','E')
pos <- c(0,1,0,1,0,0,1,0)
df <- data.frame(region, pos)

For each region, I would like to select the row where the value of pos is 1. If there is a region where there is no pos value of 1, I would like to select the 0 row. So in essence I would like to have a resulting data set that has one row for each region but for the regions where this is a 1, I would like to select that row. There is never more than 2 rows for each region.

The result should look like this:

Thanks in advance for your help.

Solution

If you're certain all values are 0s and 1s, you might be able to get away with:

library(dplyr)
df %>%
  slice_max(pos, by = region, with_ties = FALSE)
#   region pos
# 1      A   1
# 2      B   1
# 3      C   0
# 4      D   1
# 5      E   0

(You can verify this returns at most one row per region by using bind_rows(df, df) instead of df, just to demonstrate.)

If that's not safe, though, then something like

df %>%
  group_by(region) %>%
  filter(row_number() == c(which(pos == 1), which(pos == 0))[1]) %>%
  ungroup()
# # A tibble: 5 × 2
#   region   pos
#   <chr>  <dbl>
# 1 A          1
# 2 B          1
# 3 C          0
# 4 D          1
# 5 E          0

In this case, if a region has neither pos == 1 nor pos == 0 then the region will go away.