
Trimming URLs via Regex, but not to root


I have a long list of URLs, about 100K of them.

They look something like this:

blog.example.com/ilovecats/2011/02/10/the-bling-ring/
blog.example.com/fas24
blog.example.com/morg
blog.example.com/whistlermoar/
blog.example.com/punny/
blog.example.com/punny/2012/10/
blog.example.com/punny/2012/10/01/my-mom-is-alien/
blog.example.com/anniesblog/2012/10/12/i-lost-my-iphone
blog.example.com/anniesblog/2012/10/page/3/
blog.example.com/anniesblog/2012/10/page/4
blog.example.com/anniesblog/2012/10/page/5
blog.example.com/alfva/
blog.example.com/dudewheresmycar/
blog.example.com/mynameisbilly/
blog.example.com/mynameisbilly/page/23/
blog.example.com/anotherflower/category/axel/
blog.example.com/naxramas/
blog.example.com/angeleoooo/
blog.example.com/angeleoooo/2011/01/01/
blog.example.com/angeleoooo/2011/01/01/happynew-years/

I want everything after example.com/username/ to be removed, so the remaining list will look something like this:

blog.example.com/ilovecats/
blog.example.com/fas24
blog.example.com/morg
blog.example.com/whistlermoar/
blog.example.com/punny/
blog.example.com/anniesblog/
blog.example.com/alfva/
blog.example.com/dudewheresmycar/
blog.example.com/mynameisbilly/
blog.example.com/anotherflower/
blog.example.com/naxramas/
blog.example.com/angeleoooo/

I heard that regex is a way of doing this, so I have been googling around for some hours now and I am about to run out of time.

Can someone help me?

(I have Notepad++ installed.)


Solution

  • You can use:

    (blog\.example\.com/\w+/?).*
    

    Put this in the 'Find what' field and select 'Regular expression' under Search Mode. Leave '. matches newline' unchecked, otherwise .* will run past the end of the line.

    In the 'Replace with' field, put:

    \1
    

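    Then click 'Replace All'. This trims every line down to the captured group, but repeated lines stay in the file, so duplicates still have to be removed separately.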
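
    If a script is an option, the same trim plus duplicate removal can be sketched in Python. This is only a minimal sketch, not part of the Notepad++ answer: the function name trim_urls and the sample list are made up for illustration, and it assumes usernames consist only of letters, digits and underscores (what \w matches).

```python
import re

# Host plus the first path segment (the username) and an optional trailing slash.
# Anything after that is matched by .* and dropped by the replacement.
PATTERN = re.compile(r"^(blog\.example\.com/\w+/?).*$")


def trim_urls(lines):
    """Trim each URL to blog.example.com/username/ and drop repeats,
    keeping the order in which trimmed URLs first appear."""
    seen = set()
    result = []
    for line in lines:
        trimmed = PATTERN.sub(r"\1", line.strip())
        if trimmed and trimmed not in seen:
            seen.add(trimmed)
            result.append(trimmed)
    return result


# Hypothetical sample taken from the question's list.
sample = [
    "blog.example.com/punny/",
    "blog.example.com/punny/2012/10/",
    "blog.example.com/punny/2012/10/01/my-mom-is-alien/",
    "blog.example.com/fas24",
    "blog.example.com/anniesblog/2012/10/page/4",
]

for url in trim_urls(sample):
    print(url)
```

    Lines that do not match the pattern are passed through unchanged, so they are easy to spot in the output. A username containing a hyphen or dot would be cut short by \w+; in that case a character class such as [^/]+ may be a better fit.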