Regex with lookbehind searching for substring


I need help with a regex. I want to replace all occurrences such as 5%, 7%, 183%, or 99% with `` (empty string), BUT if the occurrence is 0% or 100% (there could be dozens of occurrences that I want to preserve), I don't want to change anything.

For example:

aaa 0% bbb should become aaa 0% bbb (nothing changed)

However

aaa 40% bbb should become aaa bbb

I came up with a regex using negative lookbehind but it just removes the % sign, not the number. This is the regex:

replace (?<!(0|100))% with `` (empty string)

The regex above, when applied to the string aaa 40% bbb returns aaa 40 bbb.


Solution

  • This regex pattern will keep the 0% and 100%, and remove any other percentage (number, sign, and the following space).

    The pattern first makes sure there is a preceding whitespace character. It then tries to match 100% or 0% followed by whitespace. If that matches, the overall match is replaced by the captured 100% or 0% plus its trailing whitespace character (i.e. the matched string is left unchanged). Otherwise, it matches any digits followed by % and a whitespace character and replaces them with an empty string, effectively erasing the percentage and one whitespace character.

    REGEX (PCRE2 Flavor)

    $pattern = '(?<=\s)(?:((?:10)?0%\s)|\d+%\s)'
    $replacement = '$1'
    

    Regex Demo: https://regex101.com/r/VCOPJg/3

    NOTES:

    • (?<=\s) Positive lookbehind. Matches if there is a preceding whitespace character (such as a space, tab, or newline).
    • (?:...) Non-capturing group.
    • ((?:10)?0%\s) Capturing group (...) number 1. In the $replacement string referred to as $1.
    • (?:10)? Non-capturing group. Optional ?, match 0 or 1 times. Match literal 1 followed by literal 0.
    • 0% Match literal 0 followed by literal %.
    • | OR.
    • \d+%\s Match 1 or more (+) digits (\d) followed by literal % and a whitespace character (\s).
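
The pieces described in the notes can be tried out with a minimal sketch in Python. This is an assumption on my part: the answer only names the PCRE2 flavor, but this particular pattern uses nothing that Python's `re` module lacks. The function name `strip_percentages` is hypothetical. A callback is used for the replacement so that an unmatched group 1 simply becomes an empty string.

```python
import re

# Same pattern as in the answer: group 1 captures "0%" or "100%" plus the
# trailing whitespace; the second alternative matches any other percentage.
PATTERN = re.compile(r'(?<=\s)(?:((?:10)?0%\s)|\d+%\s)')


def strip_percentages(text):
    """Remove every 'N% ' except '0% ' and '100% ' (hypothetical helper)."""
    # If group 1 matched, put it back unchanged; otherwise delete the match.
    return PATTERN.sub(lambda m: m.group(1) or '', text)


if __name__ == '__main__':
    for line in ['aaa 0% bbb', 'aaa 100% bbb', 'aaa 40% bbb', 'aaa 555% bbb']:
        print(repr(strip_percentages(line)))
    # prints:
    # 'aaa 0% bbb'
    # 'aaa 100% bbb'
    # 'aaa bbb'
    # 'aaa bbb'
```

Because the pattern requires whitespace on both sides, a percentage at the very start or end of the string, or one followed by punctuation, is left untouched.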

    TEST STRING

    aaa 0% bbb
    aaa 100% bbb
    aaa 40% bbb
    aaa 555% bbb
    aaa 77% bbbaaa 577% bbb 100%, 0%, 50% aaa, bbb (%%%444%%%)
    For example:
    aaa 0% bbb should become aaa 0% bbb (nothing changed)
    However
    aaa 40% bbb should become aaa bbb
    

    RESULT:

    aaa 0% bbb
    aaa 100% bbb
    aaa bbb
    aaa bbb
    aaa bbbaaa bbb 100%, 0%, aaa, bbb (%%%444%%%)
    For example:
    aaa 0% bbb should become aaa 0% bbb (nothing changed)
    However
    aaa bbb should become aaa bbb