
Parsing custom Dates and Months in Roman Numerals with Tidyverse?


The Tidyverse has the fabulous readr package, which offers a wide variety of parse functions, such as parse_date, parse_factor and the other parse_* functions, plus guess_parser. I have a custom month-year format that uses Roman numerals for the month:

> emptyOffices$Month
[1] " II/90" " I/91"  " II/91" " I/92"  " II/92" " I/93"  " II/93"

> guess_parser(emptyOffices$Month)
[1] "character"

where I stands for January, II stands for February and so on. For example, II/90 stands for February 1990. guess_parser does not recognize this month-year format and falls back to character. Perhaps there is a tool with which I can define the months to help the parser understand this?

Is there a tool in any Tidyverse package for reading custom dates such as these, with months written as Roman numerals?


Solution

  • There must be a better tidy solution, but this one works:

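    library(dplyr)
    foo <- c("II/90", "I/91", "II/91", "I/92", "II/92", "I/93", "II/93")
    foo %>%
        tibble() %>%    # the piped vector becomes a column named "."
        # inside mutate(), "." therefore refers to that column
        mutate(year     = gsub(".*/", "", .),               # keep the part after "/"
               monthRom = as.roman(gsub("/.*", "", .))) %>% # part before "/", as class roman
        mutate(monthNum = as.numeric(monthRom)) %>%         # II -> 2
        mutate(monthChr = month.abb[monthNum])              # 2 -> "Feb"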
    # A tibble: 7 x 5
          .  year monthRom monthNum monthChr
      <chr> <chr>    <chr>    <dbl>    <chr>
    1 II/90    90       II        2      Feb
    2  I/91    91        I        1      Jan
    3 II/91    91       II        2      Feb
    4  I/92    92        I        1      Jan
    5 II/92    92       II        2      Feb
    6  I/93    93        I        1      Jan
    7 II/93    93       II        2      Feb
    

    Or you can simply do this:

    foo %>%
        gsub("/.*", "", .) %>%
        as.roman() %>%
        as.numeric() %>%
        month.abb[.]
    

    This uses as.roman from utils to convert the month part into an object of class roman, converts that object to a numeric vector, and uses it to index base R's month.abb to get the month abbreviation.
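
If an actual Date is wanted rather than a month abbreviation, the same as.roman idea can be extended as in the minimal sketch below. The helper name parse_roman_month is hypothetical (it is not part of readr or any Tidyverse package). The sketch assumes every value denotes the first day of the month, and that two-digit years follow base R's %y rule (69-99 map to 19xx, 00-68 to 20xx). It also trims whitespace first, since the values in the question carry a leading space, which as.roman would otherwise turn into NA.

```r
# Hypothetical helper: turn strings like " II/90" into Date values.
parse_roman_month <- function(x) {
  x <- trimws(x)                                     # drop the leading space seen in the question's data
  month <- as.integer(utils::as.roman(sub("/.*", "", x)))  # "II" -> 2
  year  <- sub(".*/", "", x)                         # "90"
  # Assumption: day 1 of the month; %y decides the century (90 -> 1990)
  as.Date(sprintf("%s-%02d-01", year, month), format = "%y-%m-%d")
}

parse_roman_month(c(" II/90", " I/91"))
# [1] "1990-02-01" "1991-01-01"
```

Inside a pipeline this could be used as mutate(date = parse_roman_month(Month)). Building a standard string first and then parsing it keeps the Roman-numeral handling separate from the date parsing, so the final as.Date call could be replaced by another date parser if preferred.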