The following Python code:
import re
line="http://google.com"
procLine = re.match(r'(?<=http).*', line)
if procLine.group() == "":
    print(line + ": did not match regex")
else:
    print(procLine.group())
does not match successfully, and outputs the following error:
Traceback (most recent call last):
  File "C:/Users/myUser/Documents/myScript.py", line 5, in <module>
    if procLine.group() == "":
AttributeError: 'NoneType' object has no attribute 'group'
When I replace the regex with just .* it works fine, which suggests the regex is at fault. However, when I test my regex and string on https://regex101.com/ with the Python flavor, it appears to match fine.
Any ideas?
If you convert your lookbehind to a non-capturing group, this should work:
In [7]: re.match(r'(?:http://)(.*)', line)
Out[7]: <_sre.SRE_Match object; span=(0, 17), match='http://google.com'>
In [8]: _.group(1)
Out[8]: 'google.com'
The reason a lookbehind does not work is that, as Rawing mentioned, re.match only attempts a match at the start of the string. At index 0 there is nothing before the current position, so the lookbehind (?<=http) can never be satisfied and re.match returns None.
If you insist on using a lookbehind, switch to re.search:
In [10]: re.search(r'(?<=http://).*', line)
Out[10]: <_sre.SRE_Match object; span=(7, 17), match='google.com'>
In [11]: _.group()
Out[11]: 'google.com'
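As a side note, the AttributeError in the question comes from calling .group() on None, since a failed match returns None rather than an empty match. Below is a minimal sketch of the original script with that check added, using the question's original pattern (without the "://" part). The describe helper is a hypothetical name introduced only for illustration.

```python
import re

line = "http://google.com"

# re.match only tries the pattern at index 0, where nothing precedes the
# current position, so the lookbehind (?<=http) fails and None is returned.
anchored = re.match(r'(?<=http).*', line)

# re.search tries each position in turn, so it can succeed at index 4,
# immediately after "http".
scanned = re.search(r'(?<=http).*', line)


def describe(m, text):
    """Hypothetical helper: return the matched text or a 'no match' message."""
    if m is None:  # a failed match is None, not an empty match object
        return text + ": did not match regex"
    return m.group()


print(describe(anchored, line))
print(describe(scanned, line))
```

Testing against None (rather than comparing .group() to an empty string) distinguishes "no match at all" from "matched an empty string", which .* is allowed to do.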