Search code examples
rdatetimedplyrtidyverseposixct

How to Filter Datetime by 'N' minute intervals in R?


I have a datetime column that I'd like to filter down to only contain 15 minute intervals with the first row's datetime being the baseline. Observe:

        order_dates  
1     2022-08-14 12:15:10
2     2022-08-14 12:15:11
3     2022-08-14 12:15:13 
4     2022-08-14 12:20:10
5     2022-08-14 12:20:16
6     2022-08-14 12:20:14
7     2022-08-14 12:25:19
8     2022-08-14 12:25:12      
9     2022-08-14 12:25:20 
10     2022-08-14 12:30:23
11     2022-08-14 12:30:31      
12     2022-08-14 12:30:34 
13     2022-08-14 12:40:32
14     2022-08-14 12:40:52  
15     2022-08-14 12:40:51
16     2022-08-14 12:45:40
17     2022-08-14 12:45:45
18     2022-08-14 12:45:23    

I would like the final dataset to only contain rows that contain 15 minute time-intervals with 12:15 as the baseline. So, I would like my final dataset to only contain rows 1,2,3,10,11,12,16,17,18. I would also like the final times to have the seconds column removed as well. Also, the order_dates column is of the POSIXct class


Solution

  • Using dplyr and lubridate:

    library(dplyr)
    library(lubridate)
    
    df %>% 
      mutate(order_dates = format(order_dates ,format = "%Y-%m-%d %H:%M"),
             mins = lubridate::minute(order_dates)) %>% 
      filter(mins %in% c(15, 30, 45)) %>% 
      select(-mins)
    
    

    This gives:

    # A tibble: 9 × 1
      order_dates     
      <chr>           
    1 2022-08-14 12:15
    2 2022-08-14 12:15
    3 2022-08-14 12:15
    4 2022-08-14 12:30
    5 2022-08-14 12:30
    6 2022-08-14 12:30
    7 2022-08-14 12:45
    8 2022-08-14 12:45
    9 2022-08-14 12:45
    

    Data:

    structure(list(order_dates = structure(c(1660479310, 1660479311, 
    1660479313, 1660479610, 1660479616, 1660479614, 1660479919, 1660479912, 
    1660479920, 1660480223, 1660480231, 1660480234, 1660480832, 1660480852, 
    1660480851, 1660481140, 1660481145, 1660481123), class = c("POSIXct", 
    "POSIXt"), tzone = "UTC")), class = c("tbl_df", "tbl", "data.frame"
    ), row.names = c(NA, -18L))