r, dataframe, select, time-series

Group a dataframe and keep rows with the group-wise minimum priority value


I am currently trying to create a long time series from multiple different measurements.

Example data df:

df <- structure(list(Site = c(1L, 1L, 1L, 2L), Result = c(0.51, 0.55, 
1.2, 4), Date = c("01.01.1999", "01.01.1999", "02.01.1999", "02.01.1999"
), Priority = c(1L, 2L, 4L, 1L)), class = "data.frame", row.names = c(NA, 
4L))

My goal is to create a time series for every site that is as complete as possible. How can I remove entries for the same site and date that have a lower priority (i.e. a higher Priority value)? In other words, if there are multiple values for a site and date, how can I keep only the one with the lowest Priority value?

I am at a bit of a loss as I am not sure which way to approach this problem.

Sorry for my very vague description. I hope someone can point me in the right direction.

Thank you very much in advance!

PS: I am using the tidyverse.


Solution

  • like this?

    library(dplyr)
    
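    df |>
      group_by(Site, Date) |>
      # keep the row with the lowest Priority value (= highest priority) per site and date
      slice_min(Priority, n = 1, with_ties = FALSE) |>
      ungroup()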
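
Since the question asks for the lowest Priority value per site and date, `slice_min()` is the function that matches that wording. If the result is meant to be used as an actual time series, it may also help to parse Date into a real date and sort by it. A minimal sketch, assuming the dates are always in dd.mm.yyyy format and that ties in Priority can be broken arbitrarily (`ts_df` is just an illustrative name):

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  Site = c(1L, 1L, 1L, 2L),
  Result = c(0.51, 0.55, 1.2, 4),
  Date = c("01.01.1999", "01.01.1999", "02.01.1999", "02.01.1999"),
  Priority = c(1L, 2L, 4L, 1L)
)

ts_df <- df |>
  # assumption: all dates use the dd.mm.yyyy format
  mutate(Date = as.Date(Date, format = "%d.%m.%Y")) |>
  group_by(Site, Date) |>
  # lowest Priority value wins; with_ties = FALSE keeps exactly one row per group
  slice_min(Priority, n = 1, with_ties = FALSE) |>
  ungroup() |>
  arrange(Site, Date)

ts_df
```

With the example data this should leave three rows: the Result 0.51 entry for site 1 on 01.01.1999 (the Priority 2 duplicate is dropped), plus the single entries for site 1 and site 2 on 02.01.1999.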