regex sublimetext3

Replacing strings and characters inside urls


I'm trying to replace characters inside URLs that start with a certain pattern and to add ".html" at the end. I'm using Sublime Text but can't get it to work completely.

I would like to make 3 changes:

  1. replacing https://my.master.com with https__my.master.com
  2. adding ".html" at the end of each URL
  3. replacing "/" inside these URLs with "_"

Important: Only for urls starting with href="https://my.master.com/@

Example:

href="https://my.master.com/@top-com/d/my-zen/"

Result I would like to get:

href="https__my.master.com_@top-com_d_my-zen_.html"

What I've tried so far:

Inside Sublime Text, I've tried to put inside the find field:

href="https://my.master.com/@([^\"]+)

and inside the replace field:

href="https__my.master.com_@$1.html"

It works for 1) and 2) but not 3). I don't know how to also replace "/" with "_"

The output I get with my regex:

https__my.master.com_@top-com/d/my-zen/.html

This isn't for production use.

I don't mind doing 2 find and replace in a row if it's easier for you to help me.

Thanks in advance!


Solution

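  • A single Sublime Text replacement cannot swap a variable number of "/" characters, so you need one expression per path depth. Depending on how many subdirectories your URLs have, you can use expressions similar to these (three and four segments after the "@"):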

    href="https:\/\/my\.master\.com\/@([^\/]*)\/([^\/]*)\/([^\/]*)\/"
    href="https:\/\/my\.master\.com\/@([^\/]*)\/([^\/]*)\/([^\/]*)\/([^\/]*)\/"
    

    and replace each with the matching string, running one find-and-replace per depth:

    href="https__my.master.com_@$1_$2_$3_.html"
    href="https__my.master.com_@$1_$2_$3_$4_.html"

    If you wish to simplify, modify, or explore the expression, you can test it on regex101.com: the top right panel explains each part of the pattern, and you can see how it matches against sample inputs.
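
If the number of path segments varies, a short script may be easier than one find-and-replace per depth. The following is a minimal Python sketch, not part of the original answer: it matches only href values starting with https://my.master.com/@ and applies all three changes in one pass. The names HREF_PATTERN, flatten_href, and rewrite_links are hypothetical, chosen for illustration.

```python
import re

# Assumption: only href values that start with https://my.master.com/@
# should be rewritten; everything else is left untouched.
HREF_PATTERN = re.compile(r'href="(https://my\.master\.com/@[^"]*)"')


def flatten_href(match: re.Match) -> str:
    """Build the replacement for a single matched href attribute."""
    url = match.group(1)
    # "://" becomes "__", then every remaining "/" becomes "_".
    flat = url.replace("://", "__").replace("/", "_")
    # Append ".html" and restore the surrounding attribute syntax.
    return f'href="{flat}.html"'


def rewrite_links(text: str) -> str:
    """Rewrite every matching href in the given text."""
    return HREF_PATTERN.sub(flatten_href, text)


if __name__ == "__main__":
    sample = 'href="https://my.master.com/@top-com/d/my-zen/"'
    print(rewrite_links(sample))
    # prints: href="https__my.master.com_@top-com_d_my-zen_.html"
```

Because the "/" replacement happens inside a callback rather than in the replacement string, the path depth does not matter.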