regex, notepad++, regex-negation

Regex negative lookahead ignore section of code


I have a regex that finds a section of code in XML. However, I want it to ignore a section that has a particular tag in the middle. I'm using Notepad++. Below is a simplified version of my code.

</Question>
ABC
123
<answer>

</Question>
ABC
<Question>
123
<answer>

My regex picks up both groups, but I want it to ignore the second group because of the <Question> tag in the middle.

Here is the regex I’ve tried.

(?s-i)<\/Question>(?:(?!\<Question>)).*(<answer>)

Thanks for the help!


Solution

  • If you want to ignore the second group, the match must not cross a <Question> or </Question> tag.

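    Because of (?s), the dot matches every character including a newline, so .* first runs to the end of the text and then backtracks until it can match <answer>, which is the last <answer> in the file. The negative lookahead in your pattern is checked only once, directly after </Question>, so it does not stop the match from spanning all the lines.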
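    The over-matching can be reproduced outside Notepad++. The following is a minimal sketch in Python rather than in Notepad++'s own regex engine, so the inline (?s-i) is replaced by re.DOTALL (Python is case sensitive by default) and the unneeded backslashes are dropped. The names text, attempt and matches are made up for this illustration.

```python
import re

# The simplified sample from the question.
text = """</Question>
ABC
123
<answer>

</Question>
ABC
<Question>
123
<answer>
"""

# The pattern from the question. The lookahead wraps no character,
# so it is tested only once, right after the first </Question>.
attempt = re.compile(r"</Question>(?:(?!<Question>)).*(<answer>)", re.DOTALL)

matches = [m.group(0) for m in attempt.finditer(text)]

for m in matches:
    print(repr(m))
```

    On this sample the pattern yields a single match that starts at the first </Question>, runs across the <Question> line, and ends at the last <answer>.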

    Instead, match one character at a time, and before each character assert that what is directly to the right is not <Question> or </Question>. Making the / optional covers both tags.

    You might use

    (?s-i)</Question>(?:(?!</?Question>).)*<answer>
    

    The pattern matches:

    • (?s-i) Inline modifiers: the dot matches a newline, and case-insensitive matching is turned off
    • </Question> Match literally
    • (?: Non-capturing group
      • (?! Negative lookahead, assert that what is directly to the right is not
        • </?Question> Match either </Question> or <Question>, as the ? makes the / optional (0 or 1 times)
      • ). If the assertion is true, match any single character
    • )* Close the group and repeat it 0 or more times
    • <answer> Match literally
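
    The breakdown above can be checked with a small script. This is a sketch in Python, assuming the same sample as in the question. re.DOTALL stands in for the inline (?s), and the names sample, fixed and blocks are hypothetical. Notepad++ uses a different regex engine, but the pattern uses only constructs that both engines support.

```python
import re

# The simplified sample from the question.
sample = """</Question>
ABC
123
<answer>

</Question>
ABC
<Question>
123
<answer>
"""

# Before consuming each character, the lookahead checks that neither
# <Question> nor </Question> starts at the current position.
fixed = re.compile(r"</Question>(?:(?!</?Question>).)*<answer>", re.DOTALL)

blocks = [m.group(0) for m in fixed.finditer(sample)]

for block in blocks:
    print(repr(block))
```

    Only the first group is returned. In the second group the repetition stops at <Question>, and no <answer> can be reached before it, so that attempt fails.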
