r tidyverse stringr

How can I split letters from digits with the separate function?


I have the simple example below:

library(tidyverse)
dd <- data.frame(xx = c("sdsds1234", "ddd252", "rrr34566"))
dd %>% separate(col = xx, remove = FALSE, into = c("Name", "MedID"))
         xx      Name MedID
1 sdsds1234 sdsds1234  <NA>
2    ddd252    ddd252  <NA>
3  rrr34566  rrr34566  <NA>

However, what I want is to split the letters from the digits, like this:

         xx  Name MedID
1 sdsds1234 sdsds  1234
2    ddd252   ddd   252
3  rrr34566   rrr 34566

Solution

  • Here's one way, using extract with one capture group per new column:

    library(tidyr)
    
    extract(dd, xx, c("Name", "MedID"), "([a-z]+)(\\d+)", remove = FALSE)
    
    #         xx  Name MedID
    #1 sdsds1234 sdsds  1234
    #2    ddd252   ddd   252
    #3  rrr34566   rrr 34566
    

    Since both separate and extract are superseded (as of tidyr 1.3.0), we can use the newer separate_wider_regex function instead.

    dd %>%
      separate_wider_regex(
        xx,
        c(Name = "[a-z]+", MedID = "\\d+"),
        cols_remove = FALSE
      )
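
If the real data can contain uppercase letters, or MedID should be numeric rather than character, the same call can be adapted. The following is a minimal sketch, assuming a hypothetical data frame `dd2` with mixed-case prefixes (it is not part of the question) and assuming every value is letters followed by digits:

```r
library(tidyr)

# Hypothetical data with mixed-case prefixes (not from the question)
dd2 <- data.frame(xx = c("Sdsds1234", "DDD252", "rrr34566"))

# Same approach as above, with the letter class widened to both cases
res <- separate_wider_regex(
  dd2, xx,
  c(Name = "[A-Za-z]+", MedID = "\\d+"),
  cols_remove = FALSE
)

# The new columns are character, so convert the ID explicitly if needed
res$MedID <- as.integer(res$MedID)
```

Note that the patterns together have to account for the whole string, so values that do not fit the letters-then-digits shape would need a different pattern.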