I have the following data frame:
dput(head(phone_numbers_df))
structure(list(phone_number = c("30 969166", "31   8941", "32  34057",
"33  24021", "34 685284", "36 226317"), prefix = c("30", "31", "32",
"33", "34", "36")), row.names = c(NA, -6L), class = c("tbl_df", "tbl",
"data.frame"))
There should always be 6 digits in the suffix of phone_number. Some have only 5 or 4 because they were generated from random numbers. How can I replace the white spaces with zeros?
I have tried the following, but the regex does not seem to detect the white spaces properly:
library(tidyverse)
fixed_numbers <- phone_numbers_df %>%
mutate(needs_replacement = str_detect(needs_replacement , "\\s{1,6}$")) %>%
mutate_at(vars(needs_replacement), ~ ifelse(. == TRUE, str_replace(., "\\s{1,6}$", "000000"), .))
# Display the fixed phone numbers
print(fixed_numbers)
Thanks in advance for any help!
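You don't need to detect first and then replace. Your pattern "\\s{1,6}$" is anchored to the end of the string, but the white space sits between the prefix and the digits, so it never matches. According to your description and code, you want to replace each ' ' with "0" only in the last 6 characters: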
phone_numbers_df %>%
  mutate(needs_replacement = paste(prefix, str_replace_all(str_sub(phone_number, -6), '\\s', '0'), sep = ' '))
# A tibble: 6 x 3
  phone_number prefix needs_replacement
  <chr>        <chr>  <chr>
1 30 969166    30     30 969166
2 31   8941    31     31 008941
3 32  34057    32     32 034057
4 33  24021    33     33 024021
5 34 685284    34     34 685284
6 36 226317    36     36 226317
Perhaps this looks better. Leaving a variable amount of white space between the prefix and the number does not make sense to me. You can even get rid of the white space altogether if you wish.
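If you do want to drop the separator, a minimal sketch could look like the one below. It assumes the suffix is right-aligned in the last 6 characters of phone_number, as in the question. The three-row phone_numbers_df built here is only a stand-in for the real data, and the column names suffix and compact_number are made up for illustration.

```r
library(dplyr)
library(stringr)

# Stand-in data (assumption): suffixes are right-aligned in 6 characters
phone_numbers_df <- tibble(
  phone_number = c("30 969166", "31   8941", "32  34057"),
  prefix = c("30", "31", "32")
)

compact_df <- phone_numbers_df %>%
  mutate(
    # Take the last 6 characters and turn the padding spaces into zeros
    suffix = str_replace_all(str_sub(phone_number, -6), "\\s", "0"),
    # Join prefix and suffix without any separator
    compact_number = paste0(prefix, suffix)
  )

print(compact_df)
```

Using paste0() instead of paste(..., sep = ' ') is the only change from the version above. Replacing white space on the whole string would not work here, because the single separator space would also become a zero.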