Tags: regex, notepad++, regexp-replace

Regex replacement with OR condition


I have a list of strings (thousands of them) that contain whitespace and are joined by OR conditions (alternation), for example:

Ani mal|Hu man|Pl ant|Fu ngus

I want to get rid of the whitespace inside these strings, but they sit within a huge text (XML) that contains a lot of intentional whitespace, so I cannot just delete all whitespace. I tried:

(Ani) (mal)|(Hu) (man)|(Pl) (ant)|(Fu) (ngus) replace with: $1$2

Obviously, this does not work. I am aware that I could do this in any programming language, but I wanted to see whether there is a way to do it with regex only (e.g. in Notepad++).


Solution

  • When every alternative has the same fixed number of groups and you want the group numbering to restart in each alternative, you can use a branch reset group:

    (?|(Ani) (mal)|(Hu) (man)|(Pl) (ant)|(Fu) (ngus))
    ^^^  1     2  ^ 1     2  ^  1    2  ^  1     2  ^    
    

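    Replace with $1$2. Thanks to the (?|...) group, numbering restarts at each alternative, so the two groups in every alternative are groups 1 and 2. Without it, the groups are numbered 1 through 8, and $1$2 is only populated when the first alternative (Ani mal) matches.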

    See the regex demo online.

    Notepad++ settings: in the Replace dialog, select Regular expression as the Search Mode.
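
To check the group numbering outside Notepad++, here is a minimal Python sketch. Python's standard re module does not support (?|...), so this is not the branch reset group itself. It is an emulation that uses a replacement callback on the original eight-group pattern. The names PATTERN and join_split_words are made up for illustration.

```python
import re

# The pattern from the question, without branch reset:
# the capture groups are numbered 1 through 8 across the alternatives.
PATTERN = re.compile(r"(Ani) (mal)|(Hu) (man)|(Pl) (ant)|(Fu) (ngus)")


def join_split_words(text: str) -> str:
    """Remove the space inside each listed word pair; leave all other whitespace alone."""
    # Only one alternative participates in a given match, so exactly two of the
    # eight groups are set. Concatenating the groups that are not None gives the
    # same result that $1$2 gives under a branch reset group.
    return PATTERN.sub(
        lambda m: "".join(g for g in m.groups() if g is not None),
        text,
    )


if __name__ == "__main__":
    # Shows which group slots are filled when the second alternative matches.
    print(PATTERN.search("Hu man").groups())
    print(join_split_words("<item>Ani mal</item> <item>Hu man</item>"))
```

Printing the groups for "Hu man" shows that the match lands in groups 3 and 4, which is why $1$2 comes out empty for every alternative except the first.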