So let's say I have some data like this:
ID value date
001 A 2015-12-06
001 A 2015-12-07
001 A 2015-12-08
002 B 2015-12-09
002 C 2015-12-10
003 A 2015-12-11
003 B 2015-12-12
002 B 2015-12-13
004 D 2015-12-13
004 R 2015-12-13
I want to find the value
that appears most frequently for each ID. But when there is a tie, take the most recent date's value.
Expected Output:
ID value
001 A
002 B
003 B
004 R
You may notice in the case of 004
there is the same date AND same ID during a tie. In this case, you can use the lower row entry.
You can use the following code:
library(dplyr)
df %>%
group_by(ID) %>%
mutate(n = n()) %>%
filter(date == max(date)) %>%
summarise(value = value[1])
#> # A tibble: 4 × 2
#> ID value
#> <int> <chr>
#> 1 1 A
#> 2 2 B
#> 3 3 B
#> 4 4 D
Created on 2022-07-02 by the reprex package (v2.0.1)