I've got a dataset of species observations over time and I am trying to calculate observation dates based on the max value of criteria:
Df <- data.frame(Sp = c(1,1,2,2,3,3),
Site = c("A", "B", "C", "D"),
date = c('2021-1-1','2021-1-2','2021-1-3','2021-1-4','2021-1-5','2021-1-6', "2021-03-01","2021-03-05")
N = c(2,5,9,4,14,7,3,11)
I want to create a new column called Nmax that showing the in which date the value of N for a Sp on a given Site was max, so the column would look something like this:
Dmax=c("2021-1-2", "2021-1-2", '2021-1-2', '2021-1-2', '2021-1-5', '2021-1-5', "2021-03-05","2021-03-05")
So Dmax would show that for Sp 1 in site A the date in which N was max was "2021-1-2" and so on.
I've tried grouping by Site, Sp, and date and using mutate together which.max(N) but didn't work. I'd like to keep all my rows.
Any help is welcome.
Thanks!
From your desired output, it seems like you want the max date regardless of site. Just group by site. Also, your sample data only has 6 rows for Sp instead of 8 so I just assumed a 4th Sp
Df |>
group_by(Sp) |>
mutate(Dmax = date[which.max(N)])