notepad++

Merge lines that contain a word anywhere with the line that has this word as its first word


I have a text in which some lines are prefixed by a "key" word followed by the delimiter "|".

I want to merge each line that contains this word anywhere into the line that has this word as its first word.

Example:

large text|The,first,word,on,some,lines,end,with,this,sign

other|texts,below,it,contain,this,word,anywhere,in,the,line

what is required|merge,the,lines,containing,this,word،into,the,line

Any text is here then large text another word after that

So other texts here contain this word anywhere in the line

the end what is required to merge the lines

Desired result:

large text|The,first,word,on,some,lines,end,with,this,sign Any text is here then *large text# another word after that

other|texts,below,it,contain,this,word,anywhere,in,the,line So *other# texts here contain this word anywhere in the line

what is required|merge,the,lines,containing,this,word،into,the,line the end *what is required# to merge the lines

Is there a way to do it in Notepad++?


Solution

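  • This can't be done in a single pass: each match consumes every line between the keyed line and the line merged into it, so you have to click Replace all repeatedly until no more replacements are made.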


    • Ctrl+H
    • Find what: ^(([^|\r\n]+)\|.+$)([\s\S]+?\R)(.+?)(\b\2\b)(.+?$)
    • Replace with: $1 $4*$5#$6$3
    • TICK Match case
    • TICK Wrap around
    • SELECT Regular expression
    • UNTICK . matches newline
    • Replace all
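
    To try the same idea outside Notepad++, here is a minimal Python sketch that repeats the substitution until nothing changes, which mirrors clicking Replace all several times. It is only an approximation: Python's re module has no \R, so the sketch assumes LF ("\n") line endings, and the function name replace_until_stable and the sample text (adapted from the question) are made up for illustration. The replacement wraps the key as *key#, as in the desired result.

```python
import re

# Assumption: input uses "\n" line endings (Python's re has no \R).
PATTERN = re.compile(
    r"^(([^|\r\n]+)\|.+$)([\s\S]+?\n)(.+?)(\b\2\b)(.+?$)",
    re.MULTILINE,
)
# Keyed line, a space, the merged line with the key wrapped as *key#,
# then the lines that were skipped over (group 3).
REPLACEMENT = r"\g<1> \g<4>*\g<5>#\g<6>\g<3>"


def replace_until_stable(text):
    """Repeat the substitution until a pass makes no replacement.

    Returns the final text and the number of passes that changed it.
    """
    passes = 0
    while True:
        text, count = PATTERN.subn(REPLACEMENT, text)
        if count == 0:
            return text, passes
        passes += 1


sample = "\n".join([
    "large text|The,first,word,on,some,lines,end,with,this,sign",
    "other|texts,below,it,contain,this,word,anywhere,in,the,line",
    "what is required|merge,the,lines,containing,this,word,into,the,line",
    "Any text is here then large text another word after that",
    "So other texts here contain this word anywhere in the line",
    "the end what is required to merge the lines",
])

merged, passes = replace_until_stable(sample)
print("passes:", passes)
for line in merged.splitlines():
    if line:  # each merge leaves an empty line where the merged line was
        print(line)
```

    On this sample each pass merges only one line, because the match for the first key swallows the other keyed lines in group 3. Each merge also leaves an empty line behind at the merged line's old position, which the loop at the end skips when printing.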

    Explanation:

    ^               # beginning of line
    (               # group 1
        (               # group 2
            [^|\r\n]+       # 1 or more any character that is not a pipe or linebreak
        )               # end group 2
        \|              # a pipe
        .+              # 1 or more any character but newline
        $               # end of line
    )               # end group 1
    (               # group 3
        [\s\S]+?        # 1 or more any character, including newline, not greedy
        \R              # any kind of linebreak
    )               # end group 3
    (               # group 4
        .+?             # 1 or more any character but newline, not greedy
    )               # end group 4
    (               # group 5
        \b              # word boundary
        \2              # backreference to group 2, the word to be searched
        \b              # word boundary
    )               # end group 5
    (               # group 6
        .+?             # 1 or more any character but newline, not greedy
        $               # end of line
    )               # end group 6
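
    To see what each group captures, the commented pattern can be run nearly as-is in Python's verbose mode. This is a sketch under assumptions: \R is replaced by \n (Python's re does not support \R), and the three-line sample text is invented for illustration.

```python
import re

pattern = re.compile(
    r"""
    ^
    (                   # group 1: the whole keyed line
        ([^|\r\n]+)     # group 2: the key, everything before the pipe
        \| .+ $
    )
    ( [\s\S]+? \n )     # group 3: everything up to the line holding the key
    ( .+? )             # group 4: text before the key on that line
    ( \b \2 \b )        # group 5: the key itself, as a whole word
    ( .+? $ )           # group 6: text after the key, to end of line
    """,
    re.MULTILINE | re.VERBOSE,
)

# Hypothetical sample: one keyed line, one unrelated line, one line to merge.
text = (
    "other|texts,below,it\n"
    "some unrelated line\n"
    "So other texts here contain this word\n"
)

m = pattern.search(text)
for number, value in enumerate(m.groups(), start=1):
    print(number, repr(value))
```

    Group 3 holds the skipped lines together with their line breaks, which is why the replacement puts it back after the merged line instead of dropping it.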
    
