r, date, dplyr, data-cleaning

In R, how can I convert several columns to Date?


Can you give me a hand with the code below? I did try to find an answer to this but might have missed it; if there is one already, sorry for taking your time.

I have a data frame like the example below. I need to convert all the dt_ variables to Date. I managed it one column at a time with mutate()/lapply, but I am looking for an automatic method. I am working in R.

co_cid Tipo  dt_notificacao co_uf_notificac~ co_uf_completo no_municipio_no~ dt_diagnostico_~
  <fct>  <fct> <fct>          <fct>            <fct>          <fct>            <fct>           
1 A90    Deng~ 01/10/2016     PE               PERNAMBUCO     Recife           23/01/2015      
2 A90    Deng~ 02/11/2016     PE               PERNAMBUCO     Recife           09/01/2015      
3 A90    Deng~ 01/11/2016     PE               PERNAMBUCO     Recife           12/12/2015      
4 A90    Deng~ 02/04/2016     PE               PERNAMBUCO     Recife           12/08/2015      
5 A90    Deng~ 01/08/2016     PE               PERNAMBUCO     Recife           12/01/2015      
6 A90    Deng~ 01/11/2016     PE               PERNAMBUCO     Recife           12/04/2015  

I collected the names of all the dt_ columns, which should be dates, with:

dt_vec <- nomes_colunas[(sapply(nomes_colunas, startsWith,prefix = "dt_"))]

Then I wanted to use it to convert all the dt_ columns to Date and replace them in the original df. I tried a for loop with mutate, but the new column takes the name of the loop variable rather than the name stored in it, so I end up with a single 'coluna' variable holding the dates.

for (coluna in dt_vec) {
  df_dados <- df_dados %>% mutate(coluna, coluna = as.Date(coluna, format = "%d/%m/%Y"))
}

As for apply, I found it hard to replace the values in the original df.

Thanks in advance!


Solution

  • If we want to convert several columns, use mutate with across

    library(lubridate)
    library(dplyr)  # across() requires dplyr >= 1.0.0
    df_dados <- df_dados %>%
                    mutate(across(starts_with('dt_'), dmy))
    

    In earlier versions of dplyr, mutate_at can be used instead:

    df_dados <- df_dados %>%
                  mutate_at(vars(starts_with('dt_')), dmy)
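
A base R alternative, not part of the answer above, in case avoiding the dplyr/lubridate dependency is useful. This is only a minimal sketch: the toy data frame and the column name dt_outra are made up for illustration, and the date format is assumed to be day/month/year as in the sample. It also touches on the two difficulties raised in the question. Inside the for loop, mutate(coluna = ...) creates a column literally named coluna instead of using the name stored in the loop variable. With lapply, the results can be written back by assigning to the same set of columns that was read.

```r
# Toy data standing in for df_dados; values are taken from the question's sample.
# dt_outra is a hypothetical column name used only for this illustration.
df_dados <- data.frame(
  co_cid         = factor(c("A90", "A90")),
  dt_notificacao = factor(c("01/10/2016", "02/11/2016")),
  dt_outra       = factor(c("23/01/2015", "09/01/2015"))
)

# Names of every column that starts with "dt_".
dt_vec <- names(df_dados)[startsWith(names(df_dados), "dt_")]

# Convert each selected column and assign the results back in one step.
# as.character() makes the factor-to-text conversion explicit before parsing.
df_dados[dt_vec] <- lapply(
  df_dados[dt_vec],
  function(x) as.Date(as.character(x), format = "%d/%m/%Y")
)

str(df_dados)
```

Assigning to df_dados[dt_vec] replaces only the selected columns, so the other columns (here co_cid) keep their original type.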