I have a date-time column stored as character in a data.table. When I convert it to POSIXct and then try rounding to date-only, I get weird results.
library(data.table)
library(lubridate)
# suppose I have these dates, in a data.table
date_chr <- c("2014-04-09 8:37 AM", "2014-09-16 6:04 PM",
"2014-09-30 3:26 PM", "2014-11-13 12:47 PM",
"2014-11-05 12:25 PM")
dat <- data.table(date_chr)
# I convert to POSIXct...
dat[, my_date := ymd_hm(date_chr)]
# ...and I want to round to date only, but this doesn't work
dat[, date_only := round(my_date, 'days')] # why does this return a list?
dat[, date_only := trunc(my_date, 'days')] # this too
class(dat$date_only) is list, and I get this warning message:
# Warning message:
# In `[.data.table`(dat, , `:=`(date_only, round(my_date, "days"))) :
# Supplied 9 items to be assigned to 5 items of column 'date_only' (4 unused)
Meanwhile, this works fine!
dat_df <- data.frame(date_chr, stringsAsFactors = FALSE)
dat_df$my_date <- ymd_hm(dat_df$date_chr)
dat_df$date_only <- round(dat_df$my_date, 'days')
class(dat_df$date_only) is POSIXlt, POSIXt, as desired.
My question is: why does this happen, and how can I avoid the issue when using data.table? There are work-arounds, like truncating the time portion of date_chr before converting, but it seems like round.POSIXt() ought to work.
Thanks for any thoughts.
Already pretty well answered in comments by @SymbolixAU.
This addresses your question about the data.frame/data.table difference on that matter.
The major difference comes from the fact that POSIXlt takes much more memory than POSIXct, and data.table does care about memory.
object.size(Sys.time())
#312 bytes
object.size(as.POSIXlt(Sys.time()))
#2144 bytes
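The size difference reflects how the two classes are stored, which is also why the assignment in the question produced a list column. The sketch below uses a made-up timestamp (an assumption, not from the question) to show that round() on a POSIXct returns a POSIXlt, and that a POSIXlt is a list of date-time components while a POSIXct is a single number per value.

```r
# Hypothetical timestamp, used only for illustration
x <- as.POSIXct("2014-04-09 08:37", tz = "UTC")

# round() and trunc() on a POSIXct return a POSIXlt object
r <- round(x, "days")
class(r)

# A POSIXlt is a list of components (sec, min, hour, ...),
# so data.table sees a list rather than an atomic vector
is.list(r)
names(unclass(r))

# A POSIXct is a plain numeric vector: seconds since the epoch
unclass(x)
```

The exact number of components in a POSIXlt depends on the R version, so it may not match the count shown in the warning above.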
It is important to know that you can still use the POSIXlt data type (and its methods) in the data.table j argument; just make sure to convert the result to POSIXct when assigning to a column.
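A minimal sketch of that conversion, assuming sample data in 24-hour format (hypothetical values, parsed with base R rather than lubridate to keep the example self-contained):

```r
library(data.table)

# Hypothetical sample data in 24-hour format
dat <- data.table(date_chr = c("2014-04-09 08:37", "2014-09-16 18:04"))
dat[, my_date := as.POSIXct(date_chr, format = "%Y-%m-%d %H:%M", tz = "UTC")]

# trunc() and round() return POSIXlt, so wrap them in as.POSIXct()
# before assigning the result to a column
dat[, date_only := as.POSIXct(trunc(my_date, "days"))]
dat[, date_rounded := as.POSIXct(round(my_date, "days"))]

class(dat$date_only)
```

If only the calendar date is needed, as.Date(my_date) or data.table's as.IDate(my_date) may be a simpler choice, since both give an atomic date column with no time part.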
If for some reason you want to store POSIXlt in a data.table: data.table does not support the POSIXlt type the same way data.frame does. You can still store POSIXlt in a data.table, but you have to wrap it in a list, like any other non-atomic data type.
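One way to do the wrapping is a list column holding one POSIXlt element per row. This is a sketch under assumptions: the id column, the date_lt column name, and the timestamps are all made up for illustration.

```r
library(data.table)

# Hypothetical data: two rows and a matching POSIXlt vector
dat <- data.table(id = 1:2)
lt <- as.POSIXlt(c("2014-04-09 08:37", "2014-09-16 18:04"), tz = "UTC")

# Split the POSIXlt vector into one element per row, then wrap the
# result in list() so data.table stores it as a single list column
dat[, date_lt := list(lapply(seq_along(lt), function(i) lt[i]))]

# Each cell is a POSIXlt object, so its components stay accessible
class(dat$date_lt[[1]])
dat$date_lt[[2]]$hour
```

A list column like this gives up vectorised operations and the memory savings of POSIXct, so it is probably only worthwhile when the POSIXlt components are needed directly.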