Tags: r, regex, str-replace, stringr

Stringr str_replace_all misses repeated terms


I'm having an issue with the stringr::str_replace_all function. I'm trying to replace all instances of iv with insuredvehicle, but the function only seems to catch the first term.

library(data.table)

temp_data <- data.table(text = 'the driver of the 1st vehicle hit the iv iv at a stop')
temp_data[, new_text := stringr::str_replace_all(pattern = ' iv ', replacement = ' insuredvehicle ', string = text)]

The outcome looks like the following, which missed the 2nd iv term:

1: the driver of the 1st vehicle hit the insuredvehicle iv at a stop

I believe the issue is that the 2 instances share a space, which is part of the search pattern. I included the spaces because I want to replace the standalone term iv, not the iv inside driver.
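A quick way to check that diagnosis, as a minimal sketch: count the matches for each pattern with stringr::str_count. The variable names here (text, n_plain, n_spaced) are illustrative and not part of the original code.

```r
library(stringr)

text <- "the driver of the 1st vehicle hit the iv iv at a stop"

# Plain "iv" also hits the "iv" inside "driver", hence the spaces in the pattern.
n_plain <- str_count(text, fixed("iv"))

# With the spaces, the first match consumes the space between the two terms,
# so the second "iv" has no leading space left to match against.
n_spaced <- str_count(text, fixed(" iv "))

n_plain   # 3
n_spaced  # 1
```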

I DON'T want to simply consolidate the repeated terms to 1. I'd like the result to look like:

1: the driver of the 1st vehicle hit the insuredvehicle insuredvehicle at a stop

I'd appreciate any help getting this to work!


Solution

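  • Maybe include a word boundary (\\b) in your regex, then remove the white spaces from both the pattern and the replacement? Regex matches cannot overlap: the first ' iv ' match consumes the space between the two terms, so the second iv has no leading space left to match. A word boundary is zero-width, so it consumes nothing. It is ideal when you want to match just a full word, not parts of words (such as the iv in driver), while staying away from these blank space issues. \\b seems to do the trick: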

    temp_data[, new_text := stringr::str_replace_all(pattern = '\\biv\\b', replacement = 'insuredvehicle', string = text)]
    
    new_text
    
    1: the driver of the 1st vehicle hit the insuredvehicle insuredvehicle at a stop
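If the requirement really is "iv surrounded by spaces" rather than "iv as a whole word", lookarounds might be an alternative. The following is a sketch, assuming stringr's default ICU regex engine; the vectors text and edge are made up for illustration. The lookbehind and lookahead assert the spaces without consuming them, so adjacent terms can both match.

```r
library(stringr)

text <- "the driver of the 1st vehicle hit the iv iv at a stop"

# (?<= ) and (?= ) require a space on each side but do not consume it,
# so the space shared by the two terms can serve both matches.
new_text <- str_replace_all(text, "(?<= )iv(?= )", "insuredvehicle")
new_text

# The two approaches differ at string edges and next to punctuation.
edge <- "iv hit the iv."
edge_boundary   <- str_replace_all(edge, "\\biv\\b", "insuredvehicle")
edge_lookaround <- str_replace_all(edge, "(?<= )iv(?= )", "insuredvehicle")
```

Here edge_boundary replaces both terms, while edge_lookaround leaves the string unchanged because neither iv has a space on both sides. For free text like claim notes, \\b is usually the more forgiving choice.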