Tags: r, dplyr

Selection of days in which one hour meets the condition


I have data with this structure (real data is more complicated):

library(dplyr)

df1 <- read.table(text = "DT   odczyt.1 odczyt.2
'2023-08-14 00:00:00'   362 1.5
'2023-08-14 23:00:00'   633 4.3
'2023-08-15 05:00:00'   224 1.6
'2023-08-15 23:00:00'   445 5.6
'2023-08-16 00:00:00'   182 1.5
'2023-08-16 23:00:00'   493 4.3
'2023-08-17 05:00:00'   434 1.6
'2023-08-17 23:00:00'   485 5.6
'2023-08-18 00:00:00'   686 1.5
'2023-08-18 23:00:00'   487 6.8
'2023-08-19 00:00:00'   566 1.5
'2023-08-19 05:00:00'   278 7.9
'2023-08-19 17:00:00'   561 11.5
'2023-08-19 18:00:00'   365 8.5
'2023-08-19 22:00:00'   170 1.8
'2023-08-19 23:00:00'   456 6.6
'2023-08-20 00:00:00'   498 1.5
'2023-08-20 03:00:00'   961 1.54
'2023-08-20 05:00:00'   397 1.6
'2023-08-20 19:00:00'   532 6.6
'2023-08-20 23:00:00'   493 3.8
'2023-08-21 01:00:00'   441 9.2
'2023-08-21 07:00:00'   793 8.5
'2023-08-21 13:00:00'   395 5.5", header = TRUE) %>% 
  mutate(DT = as.POSIXct(DT))

I want to select the 3 hours at which "odczyt.1" has its largest values (sort "odczyt.1" in descending order and take the top 3), then keep every row from the days on which those hours occurred and drop all other rows. For the given data, those hours are:

2023-08-20 03:00:00 (961)

2023-08-21 07:00:00 (793)

2023-08-18 00:00:00 (686)

Therefore, the expected result is:

> df1
                    DT odczyt.1 odczyt.2
1  2023-08-18 00:00:00      686     1.50
2  2023-08-18 23:00:00      487     6.80
3  2023-08-20 00:00:00      498     1.50
4  2023-08-20 03:00:00      961     1.54
5  2023-08-20 05:00:00      397     1.60
6  2023-08-20 19:00:00      532     6.60
7  2023-08-20 23:00:00      493     3.80
8  2023-08-21 01:00:00      441     9.20
9  2023-08-21 07:00:00      793     8.50
10 2023-08-21 13:00:00      395     5.50

Solution

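  • One approach: rank the rows by odczyt.1 in descending order, then keep every day that contains at least one row ranked in the top 3 (the .by argument needs dplyr 1.1.0 or later):

    df1 |>
      mutate(
        # local calendar day of each reading
        day = as.Date(trunc(DT)),
        # rank 1 = largest odczyt.1; ties are broken by order of appearance
        rnk = rank(-odczyt.1, ties.method = "first")
      ) |>
      # keep whole days that contain at least one top-3 reading
      filter(.by = day, any(rnk <= 3))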
    #                     DT odczyt.1 odczyt.2        day rnk
    # 1  2023-08-18 00:00:00      686     1.50 2023-08-18   3
    # 2  2023-08-18 23:00:00      487     6.80 2023-08-18  11
    # 3  2023-08-20 00:00:00      498     1.50 2023-08-20   8
    # 4  2023-08-20 03:00:00      961     1.54 2023-08-20   1
    # 5  2023-08-20 05:00:00      397     1.60 2023-08-20  17
    # 6  2023-08-20 19:00:00      532     6.60 2023-08-20   7
    # 7  2023-08-20 23:00:00      493     3.80 2023-08-20  10
    # 8  2023-08-21 01:00:00      441     9.20 2023-08-21  15
    # 9  2023-08-21 07:00:00      793     8.50 2023-08-21   2
    # 10 2023-08-21 13:00:00      395     5.50 2023-08-21  18
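
The output above still carries the helper columns day and rnk, which the expected result does not have. As a rough sketch of a variant (not part of the original answer), the same idea can be wrapped in a small function that finds the top days with slice_max() and filters with semi_join(), so no helper columns survive. The function name keep_top_days and its n argument are made up for illustration, and the column names DT and odczyt.1 are assumed to match the question's data.

```r
library(dplyr)

# Hypothetical helper (name and arguments are illustrative only):
# keep every row that falls on a day containing one of the n largest
# odczyt.1 values. Assumes columns DT (POSIXct) and odczyt.1 exist.
keep_top_days <- function(df, n = 3) {
  # Local calendar day of each timestamp; formatting first avoids the
  # UTC shift that as.Date() can apply to POSIXct values.
  with_day <- df |>
    mutate(day = as.Date(format(DT, "%Y-%m-%d")))

  # Days on which the n largest readings occurred.
  # with_ties = FALSE returns exactly n rows even if values are tied.
  top_days <- with_day |>
    slice_max(odczyt.1, n = n, with_ties = FALSE) |>
    distinct(day)

  # Keep all rows from those days, then drop the helper column.
  with_day |>
    semi_join(top_days, by = "day") |>
    select(-day)
}
```

Calling keep_top_days(df1) on the question's data should give the 10 rows from 2023-08-18, 2023-08-20 and 2023-08-21 with only the original three columns. If tied readings should all count toward the top 3, with_ties = TRUE is the setting to try instead.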