RegExp How to link matching beginning/end strings for search/replace?


I have an XML file and need to find and replace all occurrences of a certain pattern enclosed by either of two pairs of delimiters, without matching an opening delimiter from one pair with the closing delimiter from the other.

Example

 a) <CnlNum>548</CnlNum>
 b) GetBit(Val(548), 3)

548 is the sample text I need to find and change to 1548. More generally, any 3-digit number between 500 and 999 must be replaced by that number plus 1000.

I used the following:

Search regex:

(<CnlNum>|Val\s?\()([5-9])(\d{2})(</CnlNum>|\),\s?)

Replace regex:

($1)1$2$3$4

The issue is that this search regex, although it works, does not match only corresponding pairs of starting/ending strings. It also matches the following mixed pair, which is wrong:

<CnlNum>548),
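
The mismatch can be reproduced with a short sketch. Python's `re` module is an assumption here, since the tool used for the search is not named, and `matches_original` is a hypothetical helper introduced only for illustration:

```python
import re

# The search regex from above, unchanged. The opening and closing
# alternations are independent, so any opener can pair with any closer.
ORIGINAL = re.compile(r"(<CnlNum>|Val\s?\()([5-9])(\d{2})(</CnlNum>|\),\s?)")


def matches_original(text: str) -> bool:
    """Return True if the original search regex finds a match in text."""
    return ORIGINAL.search(text) is not None


if __name__ == "__main__":
    for sample in ("<CnlNum>548</CnlNum>", "GetBit(Val(548), 3)", "<CnlNum>548),"):
        print(sample, matches_original(sample))
```

All three samples match, including the mixed pair that should be rejected.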

How can I link the starting and ending texts in the regex?

I think this would be useful for linking matching XML or HTML tags (<tag>...</tag>).


Solution

  • You can capture each alternative in its own group nested inside the first group, and then use a conditional construct to pick the matching closing string:

    ((<CnlNum>)|(Val\s?\())([5-9])(\d{2})((?(2)</CnlNum>|\),\s?))
     ^        ^ ^        ^                ^^^^^         ^      ^
    

    Details:

    • ((<CnlNum>)|(Val\s?\()) - Group 1: either <CnlNum> (Group 2) or Val + an optional whitespace and a ( char (Group 3)
    • ([5-9]) - Group 4: a digit from 5 to 9
    • (\d{2}) - Group 5: any two digits
    • ((?(2)</CnlNum>|\),\s?)) - Group 6: if Group 2 matched, match </CnlNum>; otherwise, match a ), a comma, and an optional whitespace.
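
As a minimal sketch of how this pattern behaves in a replacement, here it is in Python, whose `re` module supports the `(?(2)yes|no)` conditional. The choice of Python and the helper name `add_thousand` are assumptions for illustration; conditionals are not available in every regex flavor (JavaScript, for example, lacks them). Because the groups are renumbered, the replacement now refers to Groups 1, 4, 5 and 6:

```python
import re

# Group 2 is set only when the opener is <CnlNum>; the conditional
# (?(2)...|...) then requires the closer that belongs to that opener.
PATTERN = re.compile(
    r"((<CnlNum>)|(Val\s?\())([5-9])(\d{2})((?(2)</CnlNum>|\),\s?))"
)


def add_thousand(text: str) -> str:
    """Prefix every correctly delimited 500-999 number with a 1."""
    # \g<1> is used instead of \1 so the literal 1 that follows is not
    # read as part of the group number.
    return PATTERN.sub(r"\g<1>1\4\5\6", text)


if __name__ == "__main__":
    print(add_thousand("<CnlNum>548</CnlNum>"))  # matched pair, rewritten
    print(add_thousand("GetBit(Val(548), 3)"))   # matched pair, rewritten
    print(add_thousand("<CnlNum>548),"))         # mixed pair, left unchanged
```

In tools that use `$n` replacement syntax, the equivalent replacement would probably be written as `${1}1$4$5$6`, though the exact form depends on the tool.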