Search code examples
rdatetimedate-manipulation

Mutate and format multiple date columns


I have a tibble containing some date columns formatted as strings:

library(tidyverse)

df<-tibble(dates1 = c("2020-08-03T00:00:00.000Z", "2020-08-03T00:00:00.000Z"),
           dates2 = c("2020-08-05T00:00:00.000Z", "2020-08-05T00:00:00.000Z"))

I want to convert the strings from YMD-HMS to DMY-HMS. Can someone explain to me why this doesn't work:

df %>% 
  mutate_at(vars(starts_with("dates")), as.Date, format="%d/%m/%Y %H:%M:%S")

Whereas this does?

df %>% mutate(dates1 = format(as.Date(dates1),  "%d/%m/%Y %H:%M:%S")) %>% 
  mutate(dates2 = format(as.Date(dates2),  "%d/%m/%Y %H:%M:%S"))

Finally, is it possible to assign these columns as 'datetime' columns (e.g. dttm) rather than chr once the date formatting has taken place?


Solution

  • The format argument which you are passing is for as.Date whereas what you really want is to pass it for format function. You can use an anonymous function for that or use formula syntax.

    library(dplyr)
    df %>% 
      mutate(across(starts_with("dates"), ~format(as.Date(.), "%d/%m/%Y %H:%M:%S")))
    
    # A tibble: 2 x 2
    #  dates1              dates2             
    #  <chr>               <chr>              
    #1 03/08/2020 00:00:00 05/08/2020 00:00:00
    #2 03/08/2020 00:00:00 05/08/2020 00:00:00
    

    To represent data as date or datetime R uses standard way of representing them which is Y-M-D H:M:S, you can change the representation using format but then the output would be character as above.

    df %>% 
      mutate(across(starts_with("dates"), lubridate::ymd_hms))
    
    #  dates1              dates2             
    #  <dttm>              <dttm>             
    #1 2020-08-03 00:00:00 2020-08-05 00:00:00
    #2 2020-08-03 00:00:00 2020-08-05 00:00:00