I have a DF that looks like this;
ID Station Date Value
111-001 Place1 2022-05-01 39.1
111-002 Place2 2022-05-01 0.01
222-001 Place1 2022-05-02 8.5
222-002 Place2 2022-05-02 40.3
I'd like to compare the similar IDs and keep the higher of the two. How is the best method to achieve this? I prefer using packages to base and any help is appreciated.
Use susbtr
to group_by
the 3 first numbers of ID, then use max
:
library(dplyr)
dat %>%
group_by(gp = substr(ID, 1, 3)) %>%
mutate(max = max(Value))
# A tibble: 4 × 6
# Groups: gp [2]
ID Station Date Value gp max
<chr> <chr> <chr> <dbl> <chr> <dbl>
1 111-001 Place1 2022-05-01 39.1 111 39.1
2 111-002 Place2 2022-05-01 0.01 111 39.1
3 222-001 Place1 2022-05-02 8.5 222 40.3
4 222-002 Place2 2022-05-02 40.3 222 40.3
In base R, use ave
:
with(dat, ave(Value, substr(ID, 1, 3), FUN = max))
#[1] 39.1 39.1 40.3 40.3