I want to return a match on a TV series URL:
YES: http://www.rottentomatoes.com/tv/falling-skies/
But not on a TV episode or TV season:
NO: http://www.rottentomatoes.com/tv/falling-skies/s03
NO: http://www.rottentomatoes.com/tv/falling-skies/s12/e01
I currently have the following regex:
match = re.match('(http(s)?://)?(www.)?rottentomatoes.com/tv/.+', url)
This matches all three of the above. How would I construct the regex to only match the first one?
Use a negated character class instead of .+:
^http://www\.rottentomatoes\.com/tv/[^/]+/?$
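[^/]+ matches any character that is not a slash, one or more times, which is everything after tv/ up to the next slash (or the end of the string if no / is present). The trailing /?$ then allows one optional closing slash and anchors the match at the end of the string, so a URL with a further path segment such as /s03 does not match.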
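As a quick check, here is a minimal sketch that runs the pattern against the three URLs from the question. It assumes you still want the optional scheme and www. prefix from the original regex, so those groups are merged back in; the names SERIES_URL and is_series_url are made up for illustration.

```python
import re

# The answer's pattern, with the question's optional scheme and "www." kept.
# Combining the two is an assumption; the answer itself hard-codes http://www.
SERIES_URL = re.compile(r'^(https?://)?(www\.)?rottentomatoes\.com/tv/[^/]+/?$')


def is_series_url(url):
    """Return True for a series page, False for a season or episode page."""
    return SERIES_URL.match(url) is not None


urls = [
    "http://www.rottentomatoes.com/tv/falling-skies/",
    "http://www.rottentomatoes.com/tv/falling-skies/s03",
    "http://www.rottentomatoes.com/tv/falling-skies/s12/e01",
]

for url in urls:
    # Only the first URL should report True.
    print(url, is_series_url(url))
```

Note that the dots are escaped here (www\. and \.com); in the original regex an unescaped . would match any character.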