I have to create a partial data set containing only the 20 days with the highest daily mean air temperature values for each year. My dataset looks like this:
date | mean |
---|---|
1997-07-15 | 27.05292 |
1997-07-17 | 26.86542 |
1997-06-21 | 26.10958 |
1997-07-16 | 26.05833 |
1997-07-14 | 26.02500 |
1997-06-25 | 25.80125 |
1997-07-18 | 25.36208 |
1997-06-22 | 25.18875 |
1997-06-29 | 24.72333 |
1997-06-30 | 24.71000 |
...
I tried the code below, but it only filters the maximum from every year and creates a data frame with 20 rows, whereas I need the top 20 mean values from every year (1997–2010). My data is of class data.frame, by the way. I would be so grateful if anyone could help me, I just can't figure it out!
top_20_per_year <- daily_mean_temp_sorted %>%
  slice_max(mean, n = 20)
Example taking the top 2 mean values by year:
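library(tidyverse)  # tidyverse >= 2.0.0 also attaches lubridate (ymd(), year())

df <- tribble(
  ~date,        ~mean,
  "1997-07-15", 27.05292,
  "1997-07-17", 26.86542,
  "1997-06-21", 26.10958,
  "1997-07-16", 26.05833,
  "1997-07-14", 26.02500,
  "1998-06-25", 25.80125,
  "1998-07-18", 25.36208,
  "1998-06-22", 25.18875,
  "1998-06-29", 24.72333,
  "1998-06-30", 24.71000
)

df |>
  # parse the date strings and derive the year to group on
  mutate(date = ymd(date), year = year(date)) |>
  # keep the n highest means within each year (the `by` argument needs dplyr >= 1.1.0)
  slice_max(n = 2, order_by = mean, by = year)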
#> # A tibble: 4 × 3
#>   date        mean  year
#>   <date>     <dbl> <dbl>
#> 1 1997-07-15  27.1  1997
#> 2 1997-07-17  26.9  1997
#> 3 1998-06-25  25.8  1998
#> 4 1998-07-18  25.4  1998
Created on 2024-04-29 with reprex v2.1.0
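For the full problem (top 20 per year, 1997–2010), the same idea could look roughly like the sketch below. It is only a sketch under a few assumptions: `daily_mean_temp_sorted` is rebuilt here as a hypothetical stand-in with random values, since the real data isn't available, and `date` is assumed to already be of class `Date`. It uses `group_by()` rather than the `by` argument, which should also work on dplyr versions older than 1.1.0, and `with_ties = FALSE` so that tied values cannot push a year above 20 rows.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the asker's data: one row per day, 1997-2010,
# with random numbers in place of the real daily mean temperatures.
set.seed(1)
daily_mean_temp_sorted <- data.frame(
  date = seq(as.Date("1997-01-01"), as.Date("2010-12-31"), by = "day")
)
daily_mean_temp_sorted$mean <- rnorm(nrow(daily_mean_temp_sorted), mean = 10, sd = 8)

top_20_per_year <- daily_mean_temp_sorted |>
  mutate(year = year(date)) |>   # derive the grouping variable from the date
  group_by(year) |>              # slice_max() then works within each year
  slice_max(order_by = mean, n = 20, with_ties = FALSE) |>
  ungroup()

# Check: every year should contribute exactly 20 rows
count(top_20_per_year, year)
```

If days tied with the 20th value should all be kept, drop `with_ties = FALSE`; some years may then return more than 20 rows.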