r dataframe date lubridate

R: why does a character string show days instead of a date when applying as.Date() and origin?


I have

> head(p, 10)
   date_contact mr_daterd_fu1
1                  11.10.2012
2                            
3                            
4                            
5    13.12.1994              
6                            
7    20.03.2012    20.03.2012
8    25.08.1999              
9    25.05.2012    25.05.2012
10   19.10.2007 

I need to replace missing values in p$date_contact with p$mr_daterd_fu1 as in

p <- p %>% mutate(
  fu1_date = ifelse(is.na(date_contact), 
                    as.Date(mr_daterd_fu1,  format = '%d.%m.%Y'),
                    as.Date(date_contact,  format = '%d.%m.%Y')))

But this gives

> head(p, 10)
   date_contact mr_daterd_fu1 fu1_date
1                  11.10.2012       NA
2                                   NA
3                                   NA
4                                   NA
5    13.12.1994                   9112
6                                   NA
7    20.03.2012    20.03.2012    15419
8    25.08.1999                  10828
9    25.05.2012    25.05.2012    15485
10   19.10.2007                  13805

And

> str(p)
'data.frame':   946 obs. of  3 variables:
 $ date_contact : chr  "" "" "" "" ...
 $ mr_daterd_fu1: chr  "11.10.2012" "" "" "" ...
 $ fu1_date     : num  NA NA NA NA 9112 ...

Why is p$fu1_date not displayed as a Date?

I tried

 p %>% mutate(mr_daterd_fu1 = as.Date(mr_daterd_fu1,  format = '%d.%m.%Y'),
         fu1_date = ifelse(is.na(date_contact), 
                    mr_daterd_fu1,
                    as.Date(date_contact,  format = '%d.%m.%Y', origin=mr_daterd_fu1)))

But that did not work.

Expected output:

   date_contact mr_daterd_fu1    fu1_date
1                  11.10.2012  2012.10.11
2                                      NA
3                                      NA
4                                      NA
5    13.12.1994                1994.12.13
6                                      NA
7    20.03.2012    20.03.2012  2012.03.20
8    25.08.1999                1999.08.25
9    25.05.2012    25.05.2012  2012.05.25
10   19.10.2007                2007.10.19

Data

p <- structure(list(date_contact = c("", "", "", "", "13.12.1994", 
"", "20.03.2012", "25.08.1999", "25.05.2012", "19.10.2007"), 
    mr_daterd_fu1 = c("11.10.2012", "", "", "", "", "", "20.03.2012", 
    "", "25.05.2012", "")), row.names = c(NA, 10L), class = "data.frame")

Solution

  • We can convert both columns to Date class and use coalesce(), which takes the first non-missing value from its arguments

    library(dplyr)
    p %>%
       # Parse both character columns as Date; "" becomes NA
       mutate(across(c(date_contact, mr_daterd_fu1),
               ~ as.Date(.x, format = "%d.%m.%Y"))) %>% 
       # Keep date_contact where present, otherwise fall back to mr_daterd_fu1
       mutate(fu1_date = coalesce(date_contact, mr_daterd_fu1))
    

    Output:

    #  date_contact mr_daterd_fu1   fu1_date
    #1          <NA>    2012-10-11 2012-10-11
    #2          <NA>          <NA>       <NA>
    #3          <NA>          <NA>       <NA>
    #4          <NA>          <NA>       <NA>
    #5    1994-12-13          <NA> 1994-12-13
    #6          <NA>          <NA>       <NA>
    #7    2012-03-20    2012-03-20 2012-03-20
    #8    1999-08-25          <NA> 1999-08-25
    #9    2012-05-25    2012-05-25 2012-05-25
    #10   2007-10-19          <NA> 2007-10-19
    

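    In general, it is better not to use ifelse with Date class: ifelse() drops attributes, including the class, so a Date comes back as a plain number of days since 1970-01-01. Note also that is.na() is FALSE for the empty string "", so the original test never selected mr_daterd_fu1.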
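
To illustrate why ifelse() produced numbers in the question, here is a minimal base R sketch. The three-row data frame is a cut-down copy of the question's data, and the names contact, fu1, raw, fixed, missing and filled are made up for this example. It shows the class being dropped, one way to restore it, and one way to avoid ifelse() without dplyr.

```r
# Cut-down copy of the question's data: character columns, "" for missing.
p <- data.frame(
  date_contact  = c("", "13.12.1994", "20.03.2012"),
  mr_daterd_fu1 = c("11.10.2012", "", "20.03.2012")
)

# Parse first; with an explicit format, "" does not match and becomes NA.
contact <- as.Date(p$date_contact,  format = "%d.%m.%Y")
fu1     <- as.Date(p$mr_daterd_fu1, format = "%d.%m.%Y")

# ifelse() builds its result from the logical test vector, so the Date
# class is lost and only the underlying day counts remain.
raw <- ifelse(is.na(contact), fu1, contact)
class(raw)   # "numeric"

# Option 1: put the Date class back on the numeric result.
fixed <- as.Date(raw, origin = "1970-01-01")

# Option 2: skip ifelse() and fill the gaps by index; `[<-` keeps the class.
missing <- is.na(contact)
filled <- contact
filled[missing] <- fu1[missing]
```

Both fixed and filled are Date vectors with the same values here. If dplyr is already loaded, coalesce() as in the answer above is the shorter route.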