regex, notepad++

Remove lines that do not contain certain domains


I would like to keep only the lines that end with one of these domains: .eu, .net, .be, .ru, .com

Please use the following list of domains as an example:

example1B.com
example2F.org
exampleW3.ru
exampleHD.net
exampleC3.com
exampleVS.eu
example3Z.com
exampleC4.be
exampleC4.cz
exampleC1.org
exampleC2.be
exampleC3.xyz
exampleC5.shop
exampleC6.be
exampleC7.club
exampleC8.be
exampleC9.be
exampleC11.be

Solution

  • You need to find the lines that do NOT end with one of the given strings, which calls for a negative look-behind.

    Notepad++ doesn't allow variable-length negative look-behinds (https://stackoverflow.com/a/17287598/2193968), so domains of different lengths can't be combined into a single look-behind such as (?<!\.eu|\.net).

    Instead, the regex needs one negative look-behind per domain:

    ^.*$(?<!\.eu)(?<!\.net)(?<!\.be)(?<!\.ru)(?<!\.com)\r?\n
    

    This matches any whole line, together with its end-of-line characters, whose text doesn't end with:

    • .eu
    • .net
    • .be
    • .ru
    • .com

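    Replacing every match with nothing deletes those lines, so only the lines that end with one of the listed domains remain. In the Replace dialog, set Search Mode to "Regular expression" and leave ". matches newline" unchecked; otherwise .* runs past the end of the line.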
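
To check the pattern outside Notepad++, here is a minimal Python sketch of the same idea. Python's re module also rejects variable-width look-behinds, so the five separate look-behinds carry over unchanged. The names keep_allowed_domains, UNWANTED_LINE and SAMPLE are made up for this illustration, and the line-ending normalisation is an assumption needed only because Python's "." also matches "\r".

```python
import re

# The five look-behinds from the answer, one per allowed domain.
# re.MULTILINE makes ^ and $ match at the start and end of each line.
UNWANTED_LINE = re.compile(
    r"^.*$(?<!\.eu)(?<!\.net)(?<!\.be)(?<!\.ru)(?<!\.com)\r?\n",
    re.MULTILINE,
)


def keep_allowed_domains(text: str) -> str:
    """Delete every line that does not end with an allowed domain."""
    # Assumption: normalise Windows line endings first. In Python, "." also
    # matches "\r", which would otherwise sit between the domain and "$"
    # and make every look-behind pass.
    text = text.replace("\r\n", "\n")
    return UNWANTED_LINE.sub("", text)


# A shortened version of the list from the question.
SAMPLE = """\
example1B.com
example2F.org
exampleW3.ru
exampleHD.net
exampleC4.cz
exampleC5.shop
exampleC11.be
"""

if __name__ == "__main__":
    print(keep_allowed_domains(SAMPLE), end="")
```

On this sample, the .org, .cz and .shop lines are removed and the other four are kept. Because the pattern requires \r?\n at the end, an unwanted final line with no trailing line break is not matched and has to be removed by hand.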