Search code examples
rdatereadr

Reading mixed-format dates


Where a column of dates contains two distinct formats, is there a way to parse both such that I get 2 rows of 2012-04-01 rather than the result below?

# A tibble: 2 x 1
        date
      <date>
1 2012-04-01
2         NA

**Example Code**
library(tidyverse)
data <- read_csv("date\n01/04/12\n41000", col_types = cols_only("date" = col_date("%d/%m/%y")))
head(data)

Solution

  • janitor::convert_to_date handles a mix of Excel numeric values and date-formatted dates.

    library(tidyverse)
    library(janitor)
    data <- read_csv("date,text\n01/04/12,a\n41000,b\n02/02/13,c")
    
    data %>%
          mutate(clean_date = convert_to_date(date,
                                              character_fun = lubridate::dmy))
    #> # A tibble: 3 x 3
    #>   date     text  clean_date
    #>   <chr>    <chr> <date>    
    #> 1 01/04/12 a     2012-04-01
    #> 2 41000    b     2012-04-01
    #> 3 02/02/13 c     2013-02-02
    

    Self-promotion disclaimer: I authored this package. But this function is built for precisely this use case so feels relevant to post here.