notepad++

How to remove consecutive duplicate lines based on the marked (styled) text only?


I have a text file containing a list like the following:

NEW! Most Popular 001.jpg - 1227 px - 244 px - 2507 px - 1748 px
NEW! Most Popular 002.jpg - 1227 px - 244 px - 2507 px - 1748 px
NEW! Most Popular 003.jpg - 1227 px - 244 px - 2507 px - 1748 px
NEW! Most Popular 004.jpg - 1227 px - 244 px - 2507 px - 1748 px
NEW! Most Popular 005.jpg - 1226 px - 244 px - 2508 px - 1748 px
NEW! Most Popular 006.jpg - 1339 px - 293 px - 2395 px - 1765 px
NEW! Most Popular 007.jpg - 1316 px - 290 px - 2418 px - 1760 px
NEW! Most Popular 008.jpg - 1316 px - 290 px - 2418 px - 1760 px
NEW! Most Popular 009.jpg - 1272 px - 232 px - 2462 px - 1760 px
NEW! Most Popular 010.jpg - 1272 px - 232 px - 2462 px - 1760 px
NEW! Most Popular 011.jpg - 1332 px - 228 px - 2402 px - 1764 px

Now I want to mark all the text after the second - (using the regex .jpg - .*) and then apply Remove Consecutive Duplicate Lines (Notepad++: Edit > Line Operations) to the marked text only, ignoring the file names.

This means only the following lines should remain in my list:

NEW! Most Popular 001.jpg - 1227 px - 244 px - 2507 px - 1748 px
NEW! Most Popular 005.jpg - 1226 px - 244 px - 2508 px - 1748 px
NEW! Most Popular 006.jpg - 1339 px - 293 px - 2395 px - 1765 px
NEW! Most Popular 008.jpg - 1316 px - 290 px - 2418 px - 1760 px
NEW! Most Popular 010.jpg - 1272 px - 232 px - 2462 px - 1760 px
NEW! Most Popular 011.jpg - 1332 px - 228 px - 2402 px - 1764 px

I'm sorry, but I couldn't find any solution for this.
I hope it is clear what I want to do.


Solution

    • Ctrl+H
    • Find what: ^([^-]+-[^-]+-)(.+$)(?:\R(?1)\2)+
    • Replace with: $1$2
    • TICK Match case
    • TICK Wrap around
    • SELECT Regular expression
    • UNTICK . matches newline
    • Replace all
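
To try the same replacement outside Notepad++, here is a minimal Python sketch. It is a translation, not the pattern above verbatim: Python's built-in re module has no (?1) subroutine call and no \R, so the group 1 pattern is written out twice and \n stands in for the line break. The names PREFIX, PATTERN and collapse_runs are made up for this illustration.

```python
import re

# Same shape as group 1 in the Notepad++ pattern, but also excluding "\n"
# so the prefix cannot run across a line break.
PREFIX = r"[^-\n]+-[^-\n]+-"

# (prefix)(rest of line), followed by 1+ lines that have any prefix of the
# same shape and exactly the same rest-of-line (backreference \2).
PATTERN = re.compile(
    rf"^({PREFIX})(.+)$(?:\n{PREFIX}\2$)+",
    re.MULTILINE,
)

def collapse_runs(text: str) -> str:
    """Replace each run of matching lines with the first line of that run."""
    return PATTERN.sub(r"\1\2", text)

sample = "\n".join([
    "NEW! Most Popular 003.jpg - 1227 px - 244 px - 2507 px - 1748 px",
    "NEW! Most Popular 004.jpg - 1227 px - 244 px - 2507 px - 1748 px",
    "NEW! Most Popular 005.jpg - 1226 px - 244 px - 2508 px - 1748 px",
])
result = collapse_runs(sample)
print(result)
```

As with Replace with: $1$2, the line that survives is the first one of each run (003 here), because group 1 holds the prefix of the first matched line.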

    Explanation:

    ^           # beginning of line
    (           # start group 1
        [^-]+       # 1 or more non hyphen
        -           # 1 hyphen
        [^-]+       # 1 or more non hyphen
        -           # 1 hyphen
    )           # end group 1
    (.+$)       # group 2: 1 or more of any character, up to the end of the line
    (?:         # start non-capturing group
        \R          # any kind of linebreak
        (?1)        # re-apply the pattern defined in group 1 (not its matched text)
        \2          # backreference: the exact text captured by group 2
    )+          # end group, must appear 1 or more times
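
The idea behind the pattern can also be sketched without a regex: treat everything after the second hyphen (what group 2 captures) as a comparison key and keep the first line of every run of equal keys. This is only an illustrative Python sketch. suffix_key and dedupe_consecutive are hypothetical names, and never merging lines with fewer than two hyphens is an assumption chosen to mirror the fact that the pattern needs two hyphens to match.

```python
from itertools import groupby

def suffix_key(line: str):
    """Return the text after the second hyphen (the part group 2 captures)."""
    parts = line.split("-", 2)
    if len(parts) == 3:
        return parts[2]
    # Assumption: a line with fewer than two hyphens gets a unique key,
    # so it is never merged with its neighbours.
    return object()

def dedupe_consecutive(lines):
    """Keep the first line of every run of lines that share the same key."""
    return [next(run) for _, run in groupby(lines, key=suffix_key)]

lines = [
    "NEW! Most Popular 003.jpg - 1227 px - 244 px - 2507 px - 1748 px",
    "NEW! Most Popular 004.jpg - 1227 px - 244 px - 2507 px - 1748 px",
    "NEW! Most Popular 005.jpg - 1226 px - 244 px - 2508 px - 1748 px",
    "NEW! Most Popular 007.jpg - 1316 px - 290 px - 2418 px - 1760 px",
    "NEW! Most Popular 008.jpg - 1316 px - 290 px - 2418 px - 1760 px",
]
kept = dedupe_consecutive(lines)
print("\n".join(kept))
```

Only adjacent lines are compared, so a key that reappears later in the file after a different line starts a new run and is kept again.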
    
