Trying to convert multiple character variables to datetimes. Simplified example:
#create df/tibble with two "datetime" columns still as character
df=tibble(date1=c("2013-11-26 00:10:12.536","2013-11-26 23:04:32.512","2014-02-19 23:34:44.459"),
date2=c("2013-11-26 07:06:40.720","2013-11-27 07:09:50.552","2014-02-20 08:00:03.975"))
datetimeFormat="%Y-%m-%d %H:%M:%OS"
#OK: converting a single var using $
df_temp=df
df_temp$date1=as_datetime(df_temp$date1,format = datetimeFormat)
#not OK: converting a single var using indexing (presumably because df_temp[,"date1"] is still a tibble)
df_temp=df
df_temp[,"date1"]=as_datetime(df_temp[,"date1"],format = datetimeFormat)
#also not OK: converting multiple variables in one go
datetimeVars=c("date1","date2")
df_temp=df
df_temp[,datetimeVars]=as_datetime(df_temp[,datetimeVars],format = datetimeFormat)
How can I convert multiple character columns to datetime at once, specifically using a variable containing the variable names (like datetimeVars above)?
Some context:
df=read_csv("myFile.csv",col_types=cols(datetimeVars=col_datetime(format=datetimeFormat)))
I also can't specify/hardcode variable type for each relevant variable like cols(date1=col_datetime(),date2=col_datettime, date3=...)
because the number of datetime variables isn't known ahead of timeSo currently stuck at both the levels of import (read_csv) and conversion (as_datetime). Suggestions welcome.
Tackling the conversion part since the import highly depends on the files and the included formats.
Convert to date
class using as.POSIXct
(keep in mind that the date
class is always shown in the format printed but keeps more info in the class object - see Reading below).
library(dplyr)
datetimeVars <- c("date1", "date2")
df_date <- df %>%
summarise(across(all_of(datetimeVars), as.POSIXct))
df_date
# A tibble: 3 × 2
date1 date2
<dttm> <dttm>
1 2013-11-26 00:10:12 2013-11-26 07:06:40
2 2013-11-26 23:04:32 2013-11-27 07:09:50
3 2014-02-19 23:34:44 2014-02-20 08:00:03
or with column names matching a starting pattern (starts_with()
)
datetimeVars <- c("date")
df_date <- df %>%
summarise(across(starts_with(datetimeVars), as.POSIXct))
df_date
# A tibble: 3 × 2
date1 date2
<dttm> <dttm>
1 2013-11-26 00:10:12 2013-11-26 07:06:40
2 2013-11-26 23:04:32 2013-11-27 07:09:50
3 2014-02-19 23:34:44 2014-02-20 08:00:03
Reading your desired format from date
class with strftime
df_date %>%
summarise(across(starts_with("date"), strftime, format="%Y-%m-%d %H:%M:%OS3"))
# A tibble: 3 × 2
date1 date2
<chr> <chr>
1 2013-11-26 00:10:12.536 2013-11-26 07:06:40.720
2 2013-11-26 23:04:32.512 2013-11-27 07:09:50.552
3 2014-02-19 23:34:44.459 2014-02-20 08:00:03.974