Python's re match objects have .start() and .end() methods. I want to find the start and end index of a group match. How can I do this? Example:
>>> import re
>>> REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
>>> test = "hello h889p something"
>>> match = REGEX.search(test)
>>> match.group('num')
'889'
>>> match.start()
6
>>> match.end()
11
>>> match.group('num').start() # just trying this. Didn't work
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'start'
>>> REGEX.groupindex
mappingproxy({'num': 1}) # this is the index of the group in the regex, not the index of the group match, so not what I'm looking for.
The expected output for the group above is (7, 10).
You can provide Match.start (and Match.end) with a group name to get the start (end) position of a group:
>>> import re
>>> REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
>>> test = "hello h889p something"
>>> match = REGEX.search(test)
>>> match.start('num')
7
>>> match.end('num')
10
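If you want both positions at once, as in the (7, 10) the question asks for, Match.span takes the same group argument. A minimal sketch, reusing the names from the question (num_span is just an illustrative variable name):

```python
import re

REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
test = "hello h889p something"
match = REGEX.search(test)

# span() accepts the same group argument as start() and end()
# and returns the (start, end) pair as a tuple.
num_span = match.span('num')
print(num_span)  # (7, 10)

# Slicing the original string with that span recovers the group text.
start, end = num_span
print(test[start:end])  # 889
```

The group number works as well, so match.span(1) should give the same tuple here.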
An advantage of this approach over using str.index, as suggested in other answers, is that you do not run into problems if the group string occurs multiple times.
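To make the multiple-occurrence problem concrete, here is a small sketch. The input string is a made-up variation of the one in the question, with an extra "889" placed before the real match:

```python
import re

REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
# Hypothetical input: "889" also appears at the very start of the string.
test = "889 hello h889p something"
match = REGEX.search(test)

# str.index returns the first occurrence of the text anywhere in the string,
# which is not where the group actually matched.
naive_start = test.index(match.group('num'))
print(naive_start)  # 0

# Match.start reports where the group matched.
actual_start = match.start('num')
print(actual_start)  # 11
```

Here str.index points at the stray "889" at the beginning, while match.start('num') points at the digits inside h889p.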