regex notepad++

Regex search to find and remove consecutive lines which end with same characters


I need to write a regular expression search which will locate when a line ends with the same text as the preceding line, but does not have the same first 10 characters. So in this example:

[11:12:21] Hello this is Tom. How are you?
[11:14:08] Hello this is Tom. How are you?

. . . I would need to search for consecutive lines for which the text was the same after the time entered in brackets.

I know that this search:

FIND: ^.{11}(.*)$
REPLACE: $1

. . . will locate the first 11 characters of each line and remove them.

This search:

FIND: ^((.{10}).*)(?:\r?\n\2.*)+
REPLACE: $1

. . . will locate consecutive lines whose first 10 characters are the same and keep only the first line of each run.
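
For anyone who wants to try these two searches outside Notepad++, here is a minimal Python sketch. The sample log text and the variable names are made up for illustration, and Python's re module writes the replacement as \1 where Notepad++ uses $1.

```python
import re

# Hypothetical sample: two lines share a timestamp, the third does not.
log = (
    "[11:12:21] Hello this is Tom.\n"
    "[11:12:21] How are you?\n"
    "[11:14:08] Fine, thanks.\n"
)

# First search: drop the first 11 characters of every line
# (the 10-character "[hh:mm:ss]" stamp plus the space after it).
without_stamps = re.sub(r"^.{11}(.*)$", r"\1", log, flags=re.MULTILINE)

# Second search: collapse a run of consecutive lines that share the
# same first 10 characters into the first line of that run.
first_per_stamp = re.sub(
    r"^((.{10}).*)(?:\r?\n\2.*)+", r"\1", log, flags=re.MULTILINE
)

print(without_stamps)
print(first_per_stamp)
```

With this sample, the first search leaves only the message text on each line, and the second keeps one line per timestamp.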

But I can't figure out how to structure the search so it checks the text from position 11 to the end of the line, and then checks if the text on the next line from the 11th character to the end of the line is the same.


Solution

  • Match this:

    ^(.*?](.*)\n).*?]\2\n
    

    and replace with $1.

    If the file uses Windows line endings (\r\n), use \r?\n instead of \n. [\r\n]+ also works, but it will additionally match across blank lines.


    Capturing the whole line as group 1 makes the replacement simply $1.


    To work with either round brackets () or square brackets [], use the character class [\])]:

    ^(.*?[\])](.*)\n).*?[\])]\2\n
    

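
To check both patterns outside Notepad++, here is a minimal Python sketch. It is not part of the original answer: the names SQUARE, EITHER and drop_repeated_lines are hypothetical, the sample log is invented, and \r?\n is used in place of \n so that both LF and CRLF text match. One substitution pass removes only one duplicate from each matched pair, so the sketch repeats the substitution until the text stops changing, which handles runs of three or more lines.

```python
import re

# The two patterns from the answer, with \r?\n instead of \n.
# Group 1 is the whole first line including its line break;
# group 2 is the text after the first closing bracket.
SQUARE = re.compile(r"^(.*?](.*)\r?\n).*?]\2\r?\n", re.MULTILINE)
EITHER = re.compile(r"^(.*?[\])](.*)\r?\n).*?[\])]\2\r?\n", re.MULTILINE)


def drop_repeated_lines(text, pattern=SQUARE):
    """Keep the first line of each run of consecutive lines whose
    text after the closing bracket is identical."""
    while True:
        reduced = pattern.sub(r"\1", text)
        if reduced == text:  # nothing left to remove
            return text
        text = reduced


log = (
    "[11:12:21] Hello this is Tom. How are you?\n"
    "[11:14:08] Hello this is Tom. How are you?\n"
    "[11:15:30] Hello this is Tom. How are you?\n"
    "[11:16:02] Fine, thanks.\n"
)
cleaned = drop_repeated_lines(log)
print(cleaned)

# The character-class variant also accepts round brackets.
print(drop_repeated_lines("(11:12:21) Hi\n(11:14:08) Hi\n", EITHER))
```

Note that the pattern requires a line break after the second line of a pair, so a duplicate on the very last line is only caught when the text ends with a newline.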