Tags: r, regex, stringr

How can I separate words in a string without deleting letters using str_replace?


I want to split the words in each string and insert a space between them. How can I do this with str_replace without deleting any of the letters?

s1 <- c("Employee_Name", "EmpID", "MarriedID", "MaritalStatusID", "GenderID", 
        "EmpStatusID") |> print()
#> [1] "Employee_Name"   "EmpID"           "MarriedID"       "MaritalStatusID"
#> [5] "GenderID"        "EmpStatusID"
s1 |>
  stringr::str_remove_all("_") |>
  # I want to separate the words in the string and add a space in between
  stringr::str_replace_all("([a-z][A-Z])", " ")
#> [1] "Employe ame"   "Em D"          "Marrie D"      "Marita tatu D"
#> [5] "Gende D"       "Em tatu D"

Created on 2023-05-29 with reprex v2.0.2

I tried stringr::str_replace_all("([a-z][A-Z])", " "), but this removes the letters matched by the pattern.


Solution

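  • Use a lookbehind and a lookahead. Both are zero-width assertions, so the pattern matches the empty position between a lowercase letter and an uppercase letter, and the replacement inserts a space there without consuming any characters: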

    library(stringr)
    
    s1 |>
      str_remove_all("_") |>
      str_replace_all("(?<=[a-z])(?=[A-Z])", " ")
    # [1] "Employee Name"     "Emp ID"            "Married ID"       
    # [4] "Marital Status ID" "Gender ID"         "Emp Status ID"
    

    Alternatively, capture the two letters in groups and put them back with backreferences, with a space in between:

    s1 |>
      str_remove_all("_") |>
      str_replace_all("([a-z])([A-Z])", "\\1 \\2")
    # [1] "Employee Name"     "Emp ID"            "Married ID"       
    # [4] "Marital Status ID" "Gender ID"         "Emp Status ID"
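
If stringr is not available, the capture-group approach can be sketched in base R as well. This is only an illustration under stated assumptions: `split_words` is a hypothetical helper name that does not appear in the answer above, and `perl = TRUE` is set so that `[a-z]` and `[A-Z]` are treated as plain ASCII ranges.

```r
# Hypothetical helper (not part of the original answer): base R only.
split_words <- function(x) {
  # Drop underscores first, mirroring str_remove_all("_")
  x <- gsub("_", "", x, fixed = TRUE)
  # Capture the lowercase and the uppercase letter, then write both back
  # with a space between them so that no character is lost
  gsub("([a-z])([A-Z])", "\\1 \\2", x, perl = TRUE)
}

s1 <- c("Employee_Name", "EmpID", "MarriedID", "MaritalStatusID", "GenderID",
        "EmpStatusID")
result <- split_words(s1)
print(result)
# [1] "Employee Name"     "Emp ID"            "Married ID"
# [4] "Marital Status ID" "Gender ID"         "Emp Status ID"
```

Because the pattern only matches a lowercase letter followed directly by an uppercase one, runs of capitals such as "ID" stay together.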