
Conditional replacement of a character


I would like to replace a character in a long string only if a special sequence is present in the input. Example:

This string is a sample! I wrote it to describe my problem! I hope somebody can help me with this! I have the ID: 12345! That's all!

My desired output is:

This string is a sample. I wrote it to describe my problem. I hope somebody can help me with this. I have the ID: 12345. That's all.

The replacement should happen only when '12345' is present in the input string.

I tried positive and negative lookaheads and lookbehinds, for example:

(?<!=12345)(!+(.*))+

This does not work, and neither do my attempts with ?= or ?!.

Is this possible with PCRE replacement in one step?


Solution

  • In general, this is possible in any regex flavor that supports the \G operator, which matches at the start of the string or at the end of the previous match. Search with one of the following patterns and replace with $1 followed by the desired text:

    (?:\G(?!^)|^(?=.*CHECKME))(.*?)REPLACEME     <-- Replace REPLACEME if CHECKME is present
    (?:\G(?!^)|^(?!.*CHECKME))(.*?)REPLACEME     <-- Replace REPLACEME if CHECKME is absent
    

    In Perl, PCRE, and Onigmo, which support \K, you can search with one of the following patterns and replace with just the required text:

    (?:\G(?!^)|^(?=.*CHECKME)).*?\KREPLACEME     <-- Replace REPLACEME if CHECKME is present
    (?:\G(?!^)|^(?!.*CHECKME)).*?\KREPLACEME     <-- Replace REPLACEME if CHECKME is absent
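
When CHECKME or REPLACEME are literal strings that may contain regex metacharacters, they need escaping before they are dropped into the templates above. The following is a minimal Python sketch that only assembles the pattern text. The helper name `build_conditional_pattern` and its parameters are hypothetical, and `re.escape` is used on the assumption that its backslash escapes are also accepted by the target flavor. Python's built-in `re` module does not itself support \G or \K, so the resulting string is meant to be handed to an engine such as PCRE.

```python
import re

def build_conditional_pattern(check: str, target: str,
                              present: bool = True, keep: bool = True) -> str:
    """Assemble one of the four search patterns for literal check/target text."""
    # (?=...) requires the check text somewhere ahead; (?!...) forbids it.
    look = "(?=.*" if present else "(?!.*"
    head = r"(?:\G(?!^)|^" + look + re.escape(check) + "))"
    # With \K, replace with the new text alone; without it, with $1 + new text.
    body = r".*?\K" if keep else r"(.*?)"
    return head + body + re.escape(target)

# Pattern for "replace ! only if 12345 is present", in the \K form.
print(build_conditional_pattern("12345", "!"))
```

Passing `present=False` gives the "CHECKME is absent" variant, and `keep=False` gives the capture-group form for flavors without \K.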
    

    In your case, since the text to replace is a single character, you can use a more efficient regex in which a negated character class takes the place of the lazy .*?, leaving just one .* (the one in the lookahead):

    (?:\G(?!^)|^(?=.*12345))[^!]*\K!
    

    and replace with . (or with $1. if you use (?:\G(?!^)|^(?=.*12345))([^!]*)! instead).

    If the string can contain line breaks, use (?s)(?:\G(?!^)|^(?=.*12345))[^!]*\K! so that . also matches line break characters.

    Details

    • (?:\G(?!^)|^(?=.*12345)) - either the end of the previous match (\G(?!^)) or (|) the start of the string, provided that position is followed by any 0+ chars other than line break chars, as many as possible, and then 12345 (^(?=.*12345))
    • [^!]* - 0 or more chars other than !
    • \K - match reset operator that discards all text matched so far in the match memory buffer
    • ! - a ! char.
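
To make the chaining described above concrete, here is a minimal Python sketch that emulates it step by step. It is an illustration under stated assumptions, not the regex itself: Python's built-in `re` module has no \G or \K, so `Pattern.match(text, pos)`, which anchors at `pos`, stands in for \G, and a capture group stands in for \K. The function name `replace_bang_if_id_present` is hypothetical, and `re.DOTALL` is used so the check behaves like the (?s) variant.

```python
import re

def replace_bang_if_id_present(text: str, marker: str = "12345") -> str:
    # ^(?=.*12345): from the start of the string, look ahead for the marker.
    # re.DOTALL lets '.' cross line breaks, as in the (?s) variant.
    if not re.match(r"(?=.*" + re.escape(marker) + ")", text, re.DOTALL):
        return text  # the first match can never start, so nothing changes

    step = re.compile(r"([^!]*)!")  # [^!]* followed by the ! to replace
    pieces = []
    pos = 0
    while True:
        # match() is anchored at pos: the end of the previous match, like \G.
        m = step.match(text, pos)
        if m is None:
            break
        # Keep the skipped text (what \K would drop from the match) and
        # substitute '.' for the '!'.
        pieces.append(m.group(1) + ".")
        pos = m.end()
    pieces.append(text[pos:])  # whatever follows the last '!'
    return "".join(pieces)

sample = ("This string is a sample! I wrote it to describe my problem! "
          "I have the ID: 12345! That's all!")
print(replace_bang_if_id_present(sample))
```

Because every match after the first must begin exactly where the previous one ended, the lookahead check runs only once, at the start of the string; a string without the marker is returned unchanged.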