Cleaning a list of domains / removing subdomains


I have a large list of domains that is mixed with subdomains:

google.de
spiegel.de
sub1.google.de
zeit.de
sub1.spiegel.de

Is there a tool or a workaround in an editor (e.g. Sublime or Notepad++) to delete everything before domain.tld?

Notepad++ helped me strip all the other content that was in this list, but at this point I got stuck.


Solution

  • If you mean removing every line of the form ***.something.something, set the search mode to "Regular expression" and search for

    ^.*\..*\..*
    

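    and replace with nothing. This empties each line that contains at least two dots; the line breaks themselves stay, so blank lines are left behind.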

    EDIT: version 2, for when you want to keep the domain.tld part of the line (which produces duplicates in your example):

    ^.*\.(.*\..*)
    

    replaced with

    $1
    

    Here () and . have their usual regular expression meanings (grouping and any character), and \. escapes the . so that it matches a literal dot. $1 inserts whatever the first set of () in the search expression matched (just as $2 would insert the second group, if there were one).
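
If the list is too large to handle comfortably in an editor, the version 2 replacement can also be run as a script. The following is only a minimal sketch in Python, not part of the original answer: the names `SUBDOMAIN` and `strip_subdomains` are made up for illustration, Python writes the backreference as `\1` instead of `$1`, and the deduplication step is an added assumption about what you want to do with the duplicates.

```python
import re

# Same pattern as version 2: everything up to a dot, then capture "something.something".
# Lines with only one dot (e.g. "google.de") do not match and are left unchanged.
SUBDOMAIN = re.compile(r"^.*\.(.*\..*)$")


def strip_subdomains(lines):
    """Reduce each entry to its last two labels and drop duplicates, keeping first-seen order."""
    seen = {}  # dicts preserve insertion order, so this doubles as an ordered set
    for line in lines:
        host = line.strip()
        if not host:
            continue  # skip blank lines
        host = SUBDOMAIN.sub(r"\1", host)
        seen.setdefault(host, None)
    return list(seen)


domains = ["google.de", "spiegel.de", "sub1.google.de", "zeit.de", "sub1.spiegel.de"]
print(strip_subdomains(domains))  # ['google.de', 'spiegel.de', 'zeit.de']
```

Note that the pattern simply keeps the last two dot-separated parts, so an entry such as example.co.uk would be reduced to co.uk. That is fine for a list of .de domains like the one in the question, but not for suffixes that themselves contain a dot.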