
How to replace one or more white spaces in a string of 6 digits with zeros?


I have the following dataframe

dput(head(phone_numbers_df)) 

structure(list(phone_number = c("30 969166", "31    8941", "32   34057",  "33   24021", "34 685284", "36 226317"), prefix = c("30", "31",  "32", "33", "34", "36")), row.names = c(NA, -6L), class = c("tbl_df",  "tbl", "data.frame"))

There should always be 6 digits in the suffix of phone_number. Some have only 5 or 4 because they were generated from random numbers. How can I replace the white spaces with zeros?

I have tried the following but the regex used does not seem to detect the white spaces properly

library(tidyverse)

fixed_numbers <- phone_numbers_df %>%
  mutate(needs_replacement = str_detect(needs_replacement , "\\s{1,6}$")) %>%
  mutate_at(vars(needs_replacement), ~ ifelse(. == TRUE, str_replace(., "\\s{1,6}$", "000000"), .))

# Display the fixed phone numbers
print(fixed_numbers)

Thanks in advance for any help!


Solution

  • You don't need to detect first and then replace. Going by your description and code, you want to replace each white space with "0", but only within the last 6 characters.

    phone_numbers_df %>%
      # keep the last 6 characters, turn each white space into "0", then re-attach the prefix
      mutate(needs_replacement = paste(prefix, str_replace_all(str_sub(phone_number, -6), '\\s', '0'), sep = ' '))
    # A tibble: 6 x 3
      phone_number prefix needs_replacement
      <chr>        <chr>  <chr>            
    1 30 969166    30     30 969166        
    2 31    8941   31     31 008941        
    3 32   34057   32     32 034057        
    4 33   24021   33     33 024021        
    5 34 685284    34     34 685284        
    6 36 226317    36     36 226317   
    

    Perhaps this looks better: leaving a varying number of white spaces between the prefix and the number does not make sense to me. You can even get rid of the white space altogether if you wish.
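As a minimal sketch of that last point, the same idea can be written in base R so that it runs without any packages. The vectors are copied from the question's dput() output; the names n, suffix and compact are only illustrative and do not appear in the original code. The first call also suggests why the original pattern never matched: it is anchored at the end of the string, and none of the values ends in white space.

```r
# Sample data copied from the question's dput() output
phone_number <- c("30 969166", "31    8941", "32   34057",
                  "33   24021", "34 685284", "36 226317")
prefix <- c("30", "31", "32", "33", "34", "36")

# "\\s{1,6}$" only matches white space at the very end of the string,
# so it finds nothing here: every value ends in a digit.
grepl("\\s{1,6}$", phone_number)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE

# Take the last 6 characters and turn every white space into "0"
n <- nchar(phone_number)
suffix <- gsub("\\s", "0", substring(phone_number, n - 5, n))

# Join prefix and suffix with no separator (hypothetical "compact" format)
compact <- paste0(prefix, suffix)
compact
# [1] "30969166" "31008941" "32034057" "33024021" "34685284" "36226317"
```

Inside the mutate() call above, the equivalent change would be to pass sep = '' to paste() instead of sep = ' '.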