python regex regex-lookarounds regex-group regex-greedy

RegEx for matching specific URLs


I'm trying to write a regex in Python that will match either a URL (for example https://www.foo.com/) or a domain that starts with "sc-domain:" but has neither https nor a path.

For example, the entries below should pass:

https://www.foo.com/
https://www.foo.com/bar/
sc-domain:www.foo.com

However, the entries below should fail:

htps://www.foo.com/
https:/www.foo.com/bar/
sc-domain:www.foo.com/
sc-domain:www.foo.com/bar
scdomain:www.foo.com

Right now I'm working with the below:

^(https://*/|sc-domain:^[^/]*$)

This almost works, but it still lets submissions like sc-domain:www.foo.com/ through. Specifically, the ^[^/]*$ part does not reject an entry that contains a '/'.


Solution

  • ^((?:https://\S+)|(?:sc-domain:[^/\s]+))$
    

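    You can try this. The first alternative, https://\S+, accepts "https://" followed by one or more non-whitespace characters. The second, sc-domain:[^/\s]+, accepts "sc-domain:" followed by one or more characters that are neither '/' nor whitespace. The ^ and $ sit outside the group, so whichever alternative matches has to span the whole string, which is what rejects sc-domain:www.foo.com/.

    See the demo: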

    https://regex101.com/r/xXSayK/2
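
To check the pattern in Python rather than on regex101, a minimal sketch along these lines could be used. The names ENTRY_RE, is_valid_entry, should_pass and should_fail are made up for this illustration; the pattern is the one from the answer and the test entries are the ones listed in the question.

```python
import re

# Pattern copied verbatim from the answer; the name ENTRY_RE is illustrative.
ENTRY_RE = re.compile(r"^((?:https://\S+)|(?:sc-domain:[^/\s]+))$")


def is_valid_entry(entry: str) -> bool:
    """Return True for an https URL or an sc-domain: entry without a path."""
    # fullmatch requires the match to end at the end of the string, so it
    # also rejects a trailing newline, which `$` on its own would tolerate.
    return ENTRY_RE.fullmatch(entry) is not None


# Entries taken from the question.
should_pass = [
    "https://www.foo.com/",
    "https://www.foo.com/bar/",
    "sc-domain:www.foo.com",
]
should_fail = [
    "htps://www.foo.com/",
    "https:/www.foo.com/bar/",
    "sc-domain:www.foo.com/",
    "sc-domain:www.foo.com/bar",
    "scdomain:www.foo.com",
]

# Print each entry alongside the result of the check.
for entry in should_pass + should_fail:
    print(f"{entry!r}: {is_valid_entry(entry)}")
```

Note that the pattern does not require a trailing slash, so https://www.foo.com is accepted as well. If entries must end in '/', the first alternative would need tightening.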