r data.table lubridate posixct datetime-conversion

Convert data.table string column to POSIXct; round.POSIXt() returns a POSIXlt?


I have a date-time column stored as character in a data.table. When I convert to POSIXct and then try rounding to date-only, I get weird results.

library(data.table)
library(lubridate)

# suppose I have these dates, in a data.table
date_chr <- c("2014-04-09 8:37 AM", "2014-09-16 6:04 PM", 
              "2014-09-30 3:26 PM", "2014-11-13 12:47 PM",
              "2014-11-05 12:25 PM")
dat <- data.table(date_chr)

# I convert to POSIXct...
dat[, my_date := ymd_hm(date_chr)]

# ...and I want to round to date only, but this doesn't work
dat[, date_only := round(my_date, 'days')] # why does this return a list?
dat[, date_only := trunc(my_date, 'days')] # this too

class(dat$date_only) is list, and I get this warning message

# Warning message:
#   In `[.data.table`(dat, , `:=`(date_only, round(my_date, "days"))) :
#   Supplied 9 items to be assigned to 5 items of column 'date_only' (4 unused)

Meanwhile, this works fine!

dat_df <- data.frame(date_chr, stringsAsFactors = FALSE)
dat_df$my_date <- ymd_hm(dat_df$date_chr)
dat_df$date_only <- round(dat_df$my_date, 'days')

class(dat_df$date_only) is POSIXlt, POSIXt, as desired.

My question is: why does this happen, and how can I avoid it when using data.table? There are workarounds, like truncating the time portion of date_chr before converting, but it seems like round.POSIXt() ought to work.

Thanks for any thoughts.


Solution

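  • Already pretty well answered in the comments by @SymbolixAU.
    Addressing your question about the data.frame/data.table difference on that matter:
    round() and trunc() on a date-time return POSIXlt, which is internally a list of components (sec, min, hour, ...), so data.table sees a list rather than a single column of date-times.
    The major difference comes from the fact that POSIXlt takes much more memory than POSIXct, and data.table does care about memory.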

    object.size(Sys.time())
    #312 bytes
    object.size(as.POSIXlt(Sys.time()))
    #2144 bytes
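
    To see where the size difference comes from, here is a minimal base-R sketch. The timestamp is an illustrative value, not taken from the question's data. It shows that POSIXct is one number per date-time, while POSIXlt is a list with one component per calendar field.

    ```r
    # Illustrative timestamp (an assumption for this sketch)
    ct <- as.POSIXct("2014-04-09 08:37:00", tz = "UTC")
    lt <- as.POSIXlt(ct)

    typeof(ct)                # "double": seconds since 1970-01-01
    is.list(unclass(lt))      # TRUE: a list of components
    names(unclass(lt))        # component names such as "sec", "min", "hour"
    class(round(ct, "days"))  # round() hands back POSIXlt, not POSIXct
    ```

    The number of components depends on the R version, which is why the item count in the warning may differ from one setup to another.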
    

    Note that you can still use the POSIXlt type (and its methods) inside data.table's j argument; just make sure to convert the result to POSIXct before assigning it to a column.
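
    A minimal sketch of that conversion, using two of the question's sample strings: round() still produces POSIXlt inside j, and as.POSIXct() turns it back into an atomic vector before := stores it. The second column uses lubridate's round_date(), which should give the same result without the intermediate POSIXlt; the column name date_only2 is made up for this illustration.

    ```r
    library(data.table)
    library(lubridate)

    date_chr <- c("2014-04-09 8:37 AM", "2014-09-16 6:04 PM")
    dat <- data.table(date_chr)
    dat[, my_date := ymd_hm(date_chr)]

    # round() returns POSIXlt; convert to POSIXct before assigning
    dat[, date_only := as.POSIXct(round(my_date, "days"))]

    # Alternative (hypothetical column name): round_date() keeps POSIXct throughout
    dat[, date_only2 := round_date(my_date, unit = "day")]

    class(dat$date_only)  # "POSIXct" "POSIXt"
    ```

    Keep in mind that round() goes to the nearest day, so 6:04 PM lands on the following date; use trunc() in the same way if you want to drop the time portion instead.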

    If for some reason you do want to store POSIXlt in a data.table: data.table does not support POSIXlt columns the same way data.frame does. You can still store POSIXlt values, but you have to wrap them in a list, as with any other non-atomic data type.
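
    One possible reading of "wrap it into list", sketched under the assumption that you want one POSIXlt object per row: build a plain list with lapply() and assign that as a list column. The column name date_lt is made up for this illustration.

    ```r
    library(data.table)
    library(lubridate)

    date_chr <- c("2014-04-09 8:37 AM", "2014-09-16 6:04 PM", "2014-09-30 3:26 PM")
    dat <- data.table(date_chr)
    dat[, my_date := ymd_hm(date_chr)]

    # One rounded date-time per row, collected in a plain list (hypothetical column name)
    dat[, date_lt := lapply(my_date, function(x) round(x, "days"))]

    class(dat$date_lt)       # "list"
    class(dat$date_lt[[1]])  # inspect the class of a single stored element
    ```

    A list column gives up vectorised operations and most of the memory benefit, so converting to POSIXct is usually the more practical choice.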