This is the regex code:
without_header = re.findall(r'/sports/[a-z0-9\/\.\-\:]*[0-9\.]+cms', without_header_url)
It returns every URL that doesn't have the https header in front. For example:
For these, I want to prepend "" at the beginning. I don't want to use a for loop; is there an efficient way of doing it using re.sub?
You may use this regex in re.sub:
s = re.sub(r'(?<!:/)(/sports/[a-z0-9/.:-]*[0-9.]+cms)', r'https://\1', s)
RegEx Details:
(?<!:/): Negative lookbehind to assert that we don't have :/ at the previous position
(/sports/[a-z0-9/.:-]*[0-9.]+cms): Match your text and capture it in group #1
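
To see the substitution end to end, here is a minimal sketch. The base URL `https://example.com`, the helper name `add_base_url`, and the sample paths are assumptions made up for illustration; the original post does not show the real prefix or input. The pattern itself is the one from the answer.

```python
import re

# Hypothetical base URL; the original post does not show the real one.
BASE_URL = "https://example.com"

# Same pattern as in the answer: the lookbehind rejects a match that
# directly follows ":/", and group 1 captures the /sports/...cms path.
PATTERN = re.compile(r'(?<!:/)(/sports/[a-z0-9/.:-]*[0-9.]+cms)')


def add_base_url(text, base_url=BASE_URL):
    """Prefix every matched path in text with base_url, in one re.sub pass."""
    # A function replacement inserts base_url literally, so any backslashes
    # in it are not interpreted as group references.
    return PATTERN.sub(lambda m: base_url + m.group(1), text)


# Hypothetical sample input containing two relative paths.
sample = "/sports/cricket/news/12345.cms and /sports/football/678.cms"
result = add_base_url(sample)
print(result)
```

All matches are rewritten in a single call, so no explicit for loop is needed. Note that the lookbehind only inspects the two characters immediately before the path, so a path that follows a host name (rather than `:/`) would still match and get the prefix.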