Tags: r, regex, gsub, stringr, stringi

String replace with regex condition


I have a pattern that I want to match and replace with XXXX. However, I only want the pattern to be replaced if the preceding character is an A or a B, or if there is no preceding character (beginning of the string).

I know how to replace patterns using the str_replace_all function but I don't know how I can add this additional condition. I use the following code:

library(stringr)

string <- "0000A0000B0000C0000D0000E0000A0000"
pattern <- "0000"

replacement <- str_replace_all(string, pattern, "XXXX")

Result:

[1] "XXXXAXXXXBXXXXCXXXXDXXXXEXXXXAXXXX"

Desired result:

Replacement only when the preceding character is A or B, or when there is no preceding character:

[1] "XXXXAXXXXBXXXXC0000D0000E0000AXXXX"

Solution

  • You may use

    gsub("(^|[AB])0000", "\\1XXXX", string)
    


    Details

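    • (^|[AB]) - Capturing group 1 (\1): start of string (^) or (|) a single A or B ([AB])
    • 0000 - four zeros
    • \\1XXXX (replacement) - the text captured by group 1 (empty at the start of the string, otherwise A or B), followed by XXXX. The backreference puts the matched A or B back into the result.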
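
To see why the replacement needs \\1, it can help to list what the pattern actually matches. This is only a small sketch in base R using the string from the question; the variable name `matches` is illustrative and not part of the original answer.

```r
string <- "0000A0000B0000C0000D0000E0000A0000"

# Collect every substring matched by the pattern.
# A match that follows A or B includes that letter, because the
# capturing group consumes it along with the four zeros.
matches <- regmatches(string, gregexpr("(^|[AB])0000", string))[[1]]
print(matches)
# matches holds: "0000", "A0000", "B0000", "A0000"
```

Because the letter is part of the match, replacing with plain "XXXX" would drop it; \\1 restores it.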

    R demo:

    string <- "0000A0000B0000C0000D0000E0000A0000"
    gsub("(^|[AB])0000", "\\1XXXX", string)
    ## -> [1] "XXXXAXXXXBXXXXC0000D0000E0000AXXXX"
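
If you would rather not re-insert the captured character, a lookbehind is one possible alternative. The sketch below is a swapped-in technique rather than the method from the answer: it assumes PCRE via perl = TRUE and uses the negative lookbehind (?<![^AB]), which succeeds when the previous character is A or B or when there is no previous character. Since the question uses stringr, the capturing-group pattern is also passed to str_replace_all, guarded in case the package is not installed. The names `lookbehind_result` and `stringr_result` are illustrative.

```r
string <- "0000A0000B0000C0000D0000E0000A0000"

# Negative lookbehind: fail if a previous character exists and is not A or B.
# Nothing before 0000 is consumed, so no backreference is needed.
lookbehind_result <- gsub("(?<![^AB])0000", "XXXX", string, perl = TRUE)

# Same capturing-group pattern as the gsub call, but through stringr
# (skipped when stringr is not available).
if (requireNamespace("stringr", quietly = TRUE)) {
  stringr_result <- stringr::str_replace_all(string, "(^|[AB])0000", "\\1XXXX")
  stopifnot(identical(stringr_result, lookbehind_result))
}

print(lookbehind_result)
## -> [1] "XXXXAXXXXBXXXXC0000D0000E0000AXXXX"
```

The capturing-group version works with R's default regex engine, while the lookbehind version requires perl = TRUE.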