I have
mydf1 <- data.frame(unique.id = c("aa", "bb", "aa", "cc", "bb"), date = c("1/2/21", "1/4/21", "1/8/21", "1/5/21", "12/5/20"), A = c(1, 1, 1, 1, 1))
I need to update the value of A such that for any instance of multiple rows with a given unique.ID (for example rows 1 & 3), the value A is updated to a new value (2) for the row with the later date. This would result in the output of the following
mydf2 <- data.frame(unique.id = c("aa", "bb", "aa", "cc", "bb"), date = c("1/2/21", "1/4/21", "1/8/21", "1/5/21", "12/5/20"), A = c(1, 2, 2, 1, 1))
Where row 3 has A = 2 because both rows 1 and 3 have "aa" and row 3 has the later date, and rows 2 & 5 share a common unique.id but row 2 has the later date.
You can solve it easily if you sort the dates and count the rows grouped by the id
library(tidyverse)
mydf1 <- data.frame(unique.id = c("aa", "bb", "aa", "cc", "bb"), date = c("1/2/21", "1/4/21", "1/8/21", "1/5/21", "12/5/20"), A = c(1, 1, 1, 1, 1))
mydf2 <- data.frame(unique.id = c("aa", "bb", "aa", "cc", "bb"), date = c("1/2/21", "1/4/21", "1/8/21", "1/5/21", "12/5/20"), A = c(1, 2, 2, 1, 1))
mydf1 %>%
mutate(date = lubridate::dmy(date)) %>%
group_by(unique.id) %>%
arrange(date) %>%
mutate(result = dplyr::row_number())
#> # A tibble: 5 x 4
#> # Groups: unique.id [3]
#> unique.id date A result
#> <chr> <date> <dbl> <int>
#> 1 bb 2020-05-12 1 1
#> 2 aa 2021-02-01 1 1
#> 3 bb 2021-04-01 1 2
#> 4 cc 2021-05-01 1 1
#> 5 aa 2021-08-01 1 2
Created on 2021-01-28 by the reprex package (v0.3.0)