Search code examples
rlubridaterfid

combine multiple rows into one time interval


I have a data frame which is made up of the date and time a number of tagged animals visited a feeder. it looks like this -

ID    date_time                 
A     2019-11-02 08:07:47    
B     2019-11-02 08:07:48
A     2019-11-02 08:07:49
A     2019-11-02 08:07:50
A     2019-11-02 08:09:12
A     2019-11-02 08:09:13
B     2019-11-02 08:09:17

I'd like to lump all data points recorded within n seconds of each other into a single row so that it looks like this -

ID     start_date_time.      end_date_time
A      2019-11-02 08:07:47   2019-11-02 08:07:50
B      2019-11-02 08:07:48   2019-11-02 08:07:48
A      2019-11-02 08:09:12   2019-11-02 08:09:13
B      2019-11-02 08:09:17   2019-11-02 08:09:47

I have tried using Lubridate with no success.

Thanks


Solution

  • Possibly you can do it this way (N = 10s):

    library(tidyverse)
    
    dat %>%
      group_by(ID) %>%
      mutate(
        events = cut(date_time, '10 s', labels = F)
        ) %>%
      group_by(events, add = T) %>%
      summarise(
        start_date_time = min(date_time), 
        end_date_time   = max(date_time)
        ) %>%
      ungroup() %>%
      select(-events)
    
    # # A tibble: 4 x 3
    #   ID    start_date_time     end_date_time      
    #   <chr> <dttm>              <dttm>             
    # 1 A     2019-11-02 08:07:47 2019-11-02 08:07:50
    # 2 A     2019-11-02 08:09:12 2019-11-02 08:09:13
    # 3 B     2019-11-02 08:07:48 2019-11-02 08:07:48
    # 4 B     2019-11-02 08:09:17 2019-11-02 08:09:17
    

    Data:

    structure(list(
      ID = c("A", "B", "A", "A", "A", "A", "B"),
      date_time = structure(
        c(
          1572678467,
          1572678468,
          1572678469,
          1572678470,
          1572678552,
          1572678553,
          1572678557
        ),
        class = c("POSIXct", "POSIXt"),
        tzone = ""
      )
    ),
    row.names = c(NA,-7L),
    class = "data.frame")