
Convert characters to dates without some of them becoming NA after the conversion


I have this data:

    sg2<-structure(list(`Last prescription/or progression time (if progressed)` = c("23-11-2021", 
"28-09-2022", "45020", "45079", "23-05-2023"), time = c(3.682191781, 
9.008219178, 22.52054795, 20.02191781, 12.7890411), status = c(0, 
0, 0, 0, 0), `Number of bone lesions (<5 = “<5”, >=5 = “≥5”) (clear, multiple bone metastases ≥5 are defined)` = c(">=5", 
"<5", ">=5", ">=5", "<5")), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

Some of my dates are displayed as serial numbers (44383, for example), while other dates are loaded normally from my Excel file. I try to convert them all to dates as shown below, but then I get NAs:

    sg2$`Time of first prescription of denosumab (if enrolled in QL1206 and JMT, this time is the time of first use of XGEVA after leaving the group)` <- as.Date(as.numeric(sg2$`Time of first prescription of denosumab (if enrolled in QL1206 and JMT, this time is the time of first use of XGEVA after leaving the group)`), origin = "1899-12-30")
    Warning message:
    In as.Date(as.numeric(sg2$`Time of first prescription of denosumab (if enrolled in QL1206 and JMT, this time is the time of first use of XGEVA after leaving the group)`),  :
      NAs introduced by coercion

Solution

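  • `as.numeric()` returns NA for strings such as "23-11-2021", which is why the one-step conversion produces NAs. Please try the code below, which converts the two formats separately and then combines them:

    library(dplyr)

    sg2 %>% rename('col1' = 1) %>% mutate(
      col11 = ifelse(nchar(col1) == 10, col1, NA),                # "dd-mm-YYYY" strings
      col11 = as.Date(col11, tryFormats = '%d-%m-%Y'),
      col12 = ifelse(nchar(col1) == 5, col1, NA),                 # Excel serial numbers
      col12 = as.Date(as.numeric(col12), origin = '1899-12-30'),  # Excel (1900 date system) origin
      col1 = coalesce(col11, col12)
    ) %>% select(-c(col11, col12))
    
    
    # A tibble: 5 × 4
      col1        time status Number of bone lesions (<5 = “<5”, >=5 = “≥5”) (clear, multiple bone metastases ≥5 are defin…¹
      <date>     <dbl>  <dbl> <chr>                                                                                         
    1 2021-11-23  3.68      0 >=5                                                                                           
    2 2022-09-28  9.01      0 <5                                                                                            
    3 2023-04-04 22.5       0 >=5                                                                                           
    4 2023-06-02 20.0       0 >=5                                                                                           
    5 2023-05-23 12.8       0 <5                                                                                            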
    # ℹ abbreviated name:
    #   ¹​`Number of bone lesions (<5 = “<5”, >=5 = “≥5”) (clear, multiple bone metastases ≥5 are defined)`
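
  • If you would rather not depend on the string lengths, the same idea can be sketched in base R by matching each format with a regular expression. The helper name `parse_mixed_dates` is hypothetical (it is not part of any package), and the sketch assumes that the all-digit values are Excel serial numbers from the 1900 date system and that every other date is written as dd-mm-YYYY. Anything matching neither pattern stays NA.

```r
# Hypothetical helper: parse a character vector that mixes "dd-mm-YYYY"
# strings with Excel serial day numbers (assumed 1900 date system).
parse_mixed_dates <- function(x) {
  out <- rep(as.Date(NA), length(x))                      # start with all-NA dates
  is_serial <- grepl("^[0-9]+$", x)                       # e.g. "45020"
  is_dmy    <- grepl("^[0-9]{2}-[0-9]{2}-[0-9]{4}$", x)   # e.g. "23-11-2021"
  out[is_serial] <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
  out[is_dmy]    <- as.Date(x[is_dmy], format = "%d-%m-%Y")
  out
}

x <- c("23-11-2021", "28-09-2022", "45020", "45079", "23-05-2023")
parse_mixed_dates(x)
# [1] "2021-11-23" "2022-09-28" "2023-04-04" "2023-06-02" "2023-05-23"
```

    To apply it to the data above, something like `sg2[[1]] <- parse_mixed_dates(sg2[[1]])` should work, since the date column is the first one.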