Replacing value in a row based on value in another row with same identifier using R

I am trying to solve a problem in R that seems simple, but I can't work it out.

I have the following data:

tmp

  town_id city_ id flag
1   10500      111    1           
2   15300     1110    1           
3    9400    11100    1           
4    9400    11101    0           
5    9600    11102    0          
6    9800    11103    0

There is a duplicate town_id, and I would like to remove it while assigning the highest value in flag. That is, I would like to have:

  town_id city_ id flag
1   10500      111    1           
2   15300     1110    1           
3    9400    11100    1                     
4    9600    11102    0          
5    9800    11103    0

I tried to use the following dplyr code, but it assigns a 1 to everything:

tmp_2<-tmp %>% group_by(town_id) %>% mutate(flag=max(flag))

tmp_2

  town_id city_ id  flag 
1 10500   111          1           
2 15300   1110         1           
3 9400    11100        1           
4 9400    11101        1           
5 9600    11102        1           
6 9800    11103        1

Could someone please tell me what I am doing wrong?

Thank you.

Solution

I suppose this is what you want - filter out the duplicate:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
tmp <- structure(list(town_id = c(10500L, 15300L, 9400L, 9400L, 9600L, 9800L), 
                      city_id = c(111L, 1110L, 11100L, 11101L, 11102L, 11103L), 
                      flag = c(1L, 1L, 1L, 0L, 0L, 0L)), 
                 class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))
tmp_2 <- tmp %>% group_by(town_id) %>% dplyr::filter(flag==max(flag)) %>% ungroup()
tmp_2
#> # A tibble: 5 x 3
#>   town_id city_id  flag
#>     <int>   <int> <int>
#> 1   10500     111     1
#> 2   15300    1110     1
#> 3    9400   11100     1
#> 4    9600   11102     0
#> 5    9800   11103     0

Edit: In case there are more than one per group with the maximum value of flag and you only want to retain the first one, you could do:

tmp_2 <- tmp %>% group_by(town_id) %>%  
    dplyr::filter(flag==max(flag)) %>% 
    filter(row_number()==1) %>% ungroup()

^{Created on 2020-05-22 by the reprex package (v0.3.0)}