Search code examples
rdatetimedplyrfilteringslice

R- Filter by time closest to midnight


I have a series of GPS points which include date and time information, for overnight (4pm to 9am) locations. The dates have been adjusted so that any points crossing over midnight, will have the same date as the first night. I have several different individuals in the data set and I have several readings per night. I'm trying to filter the data to just one point per night which is closest to midnight.

The format of my date and time data is as follows, and there is also an ID column:

NightDate timeofday time NightDateTime
2019-03-22 19.47 19:28:00 22/03/2019 19:28:00
2019-03-22 21.88 21:53:00 22/03/2019 21:53:00
2019-03-22 2.77 02:46:00 22/03/2019 02:46:00
2019-03-23 20.53 20:32:00 23/03/2019 20:32:00
2019-03-23 23.07 23:04:00 23/03/2019 23:04:00
2019-03-23 1.67 01:40:00 23/03/2019 01:40:00
2019-03-23 4.28 04:17:00 23/03/2019 04:17:00
2019-03-24 19.57 19:34:00 24/03/2019 19:34:00
2019-03-24 23.37 23:22:00 24/03/2019 23:22:00
2019-03-24 2.05 02:03:00 24/03/2019 02:03:00
2019-03-24 4.8 04:48:00 24/03/2019 04:48:00

The code is:

Night <- read.csv("night_redate.csv")

Night.filter <- Night %>% 
  group_by(ID, NightDate) %>% 
  slice(which.min(abs(timeofday - 0)))

write.csv(Night.filter, "filtered_night.csv", row.names = FALSE)

The code I have tried nearly worked and it did filter the data to one reading a night, however, it doesn't treat 23.07 as being close to midnight (0) and is only including early morning times as being close to 0. Is there a way to adjust this code so that 1.00 and 23.00 for example are both considered to be close to midnight for the filtering?

I did also try to solve this originally using the standard date time format but the code was not filtering correctly. I have included what I tried below if it might be useful. The column NightDateTime is the date and time combined.

# Remove rows with NA in 'NightDate' column and group by ID and NightDateTime

Night.edit <- Night %>%
  filter(!is.na(DateTime)) %>%
  group_by(ID, NightDateTime) %>%
  mutate(
    # Convert NightDateTime to Date
    night_date = as.Date(NightDateTime),
    
    # Create Midnight Time
    midnight_time = as.POSIXct(paste(night_date, "00:00:00")),
    
    # Calculate Time Difference
    time_diff = difftime(midnight_time, NightDateTime),
    
    # Take Absolute Value
    time_diff_absolute = abs(as.numeric(time_diff))
  ) %>%
  # Filter for the observation closest to midnight
  filter(time_diff_absolute == min(time_diff_absolute))

Thanks for your help!

Edited to include more columns in table and original code


Solution

  • Does this give the expected output?

    library(tidyverse)
    
    df %>% 
      slice_min(as.numeric(if_else(time > hms("12:00:00"), 
                        hours(24) - time, 
                        time)),
                by = NightDate)
    
    # A tibble: 3 × 4
      NightDate  timeofday time       NightDateTime      
      <date>         <dbl> <Period>   <dttm>             
    1 2019-03-22      21.9 21H 53M 0S 2019-03-22 21:53:00
    2 2019-03-23      23.1 23H 4M 0S  2019-03-23 23:04:00
    3 2019-03-24      23.4 23H 22M 0S 2019-03-24 23:22:00