
Regex Delete Lines with delimiter `<<<<<<< HEAD` and `=======` in Reverted Commit in Git in Notepad++


Using a bad regex I accidentally deleted many lines I shouldn't have. After reverting the changes (a feature of Git Version Control), I have markdown files that look like this now:

<<<<<<< HEAD
There was a sentence here:
There was a third line here.
=======
There was a sentence here:
There was a second line here.
There was a third line here.
There were any number of lines here.
>>>>>>> parent of <commit ID> (<commit msg>)

I would like to use <<<<<<< HEAD and ======= as delimiters and delete everything between them, including the delimiters themselves. I would delete the >>>>>>> parent of <commit ID> (<commit msg>) lines separately afterwards.

My regex (.*) failed to match multiple lines between the delimiters. I am using [{[(*"'!->1-9a-zA-ZÀ-ŰØ-űø-ÿ]+ instead of a simple \w+ to cater for any line-opening character or word I might want to use. I have soft line breaks (two trailing spaces) after each sentence, in case that matters. (If you can match everything between the delimiters, it might not even matter.)

Expected result:

There was a sentence here:
There was a second line here.
There was a third line here.
There were any number of lines here.
>>>>>>> parent of <commit ID> (<commit msg>)

As I said, I would deal with >>>>>>> parent of <commit ID> (<commit msg>) afterwards.
Also, there are not always two lines between the delimiters. The varying number of lines is what causes my issue.


Solution

  • Instead of a non-greedy match, you can use a negative lookahead that matches every line in between as long as it does not consist solely of =======, which is more performant:

    ^<<<<<<< HEAD(?:\R(?!=======$).*)*+\R=======$
    

    Explanation

    • ^ Start of a line (in Notepad++, ^ and $ anchor at line boundaries)
    • <<<<<<< HEAD Match literally
    • (?: Non-capture group
      • \R Match any Unicode newline sequence
      • (?!=======$) Negative lookahead, assert that the next line is not exactly =======
      • .* Match the whole line
    • )*+ Close the non-capture group and repeat it zero or more times using a possessive quantifier (no backtracking into the lines already matched)
    • \R Match any Unicode newline sequence
    • ======= Match literally
    • $ End of the line
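
To try the same idea outside Notepad++, here is a minimal Python sketch. It is an adaptation, not the pattern above verbatim: Python's re module has no \R, so \r?\n is swapped in (assuming LF or CRLF line endings), a plain * replaces the possessive *+, and an optional trailing newline is consumed so that no empty line is left behind. The names CONFLICT_HEAD and drop_head_side are made up for illustration.

```python
import re

# Adapted pattern (assumptions): \r?\n stands in for \R, which Python's re
# does not support, and a plain * replaces the possessive *+.
# The final \n? also removes the newline after =======, so the replacement
# does not leave an empty line where the block used to be.
CONFLICT_HEAD = re.compile(
    r"^<<<<<<< HEAD(?:\r?\n(?!=======\r?$).*)*\r?\n=======\r?$\n?",
    re.MULTILINE,  # make ^ and $ anchor at line boundaries, as in Notepad++
)


def drop_head_side(text: str) -> str:
    """Remove everything from '<<<<<<< HEAD' through '=======' inclusive."""
    return CONFLICT_HEAD.sub("", text)


# The sample from the question, with two trailing spaces as soft line breaks.
sample = (
    "<<<<<<< HEAD\n"
    "There was a sentence here:  \n"
    "There was a third line here.  \n"
    "=======\n"
    "There was a sentence here:  \n"
    "There was a second line here.  \n"
    "There was a third line here.  \n"
    "There were any number of lines here.  \n"
    ">>>>>>> parent of <commit ID> (<commit msg>)\n"
)

cleaned = drop_head_side(sample)
print(cleaned)
```

With the sample from the question, cleaned starts at the second "There was a sentence here:" line and still ends with the >>>>>>> parent of line, which can then be removed in a separate pass.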

