Search code examples
rdatetimerounding

Problems with rounding datetime


I am having problems rounding the datetime columns in my dataset. I need to round a dataset to the nearest half an hour so I can use it with another dataset that have been already rounded.

I have tried the following:

library(dplyr)
library(lubridate)
library(tidyr)

# Create a datetime column
df$datetime <- with(df, ymd_hm(paste(year, month, day, hour, minute, sep = "-")))

# Round to nearest 0:30:00
df$rounded_datetime <-as.POSIXlt(round(as.double(df$datetime)/(30*60))*(30*60),origin=(as.POSIXlt('1970-01-01')))

the problem is that I get these values:

enter image description here

Why my datetime and rounded time don't match?


Solution

  • Probably related to your timezone. See this post for details. The default timezone for the ymd_hm function is "UTC".

    # Create a datetime column
    df$datetime <- with(df, 
       ymd_hm(paste(year, month, day, hour, minute, sep = "-"), 
       tz=""))  # Add this.
    
    # Round to nearest 0:30:00
    df$rounded_datetime <- as.POSIXlt(round(as.double(df$datetime)/(30*60))*(30*60),
       origin=(as.POSIXlt('1970-01-01')))
    
    df
        year month day hour minute            datetime    rounded_datetime
      1 2019     4  25    8     19 2019-04-25 08:19:00 2019-04-25 08:30:00
      2 2019     4  25    8     49 2019-04-25 08:49:00 2019-04-25 09:00:00
      3 2019     4  25    9     19 2019-04-25 09:19:00 2019-04-25 09:30:00
      4 2019     4  25    9     48 2019-04-25 09:48:00 2019-04-25 10:00:00
      5 2019     4  25   10     18 2019-04-25 10:18:00 2019-04-25 10:30:00
    

    df <- structure(list(year = c(2019, 2019, 2019, 2019, 2019), month = c(4, 
    4, 4, 4, 4), day = c(25, 25, 25, 25, 25), hour = c(8, 8, 9, 9, 
    10), minute = c(19, 49, 19, 48, 18), datetime = structure(c(1556151540, 
    1556153340, 1556155140, 1556156880, 1556158680), class = c("POSIXct", 
    "POSIXt"), tzone = ""), rounded_datetime = structure(list(sec = c(0, 
    0, 0, 0, 0), min = c(30L, 0L, 30L, 0L, 30L), hour = c(8L, 9L, 
    9L, 10L, 10L), mday = c(25L, 25L, 25L, 25L, 25L), mon = c(3L, 
    3L, 3L, 3L, 3L), year = c(119L, 119L, 119L, 119L, 119L), wday = c(4L, 
    4L, 4L, 4L, 4L), yday = c(114L, 114L, 114L, 114L, 114L), isdst = c(0L, 
    0L, 0L, 0L, 0L), zone = c("HKT", "HKT", "HKT", "HKT", "HKT"), 
        gmtoff = c(28800L, 28800L, 28800L, 28800L, 28800L)), class = c("POSIXlt", 
    "POSIXt"), tzone = c("", "HKT", "HKST"), balanced = TRUE)), row.names = c(NA, 
    -5L), class = "data.frame")