How can I find all matches that don't necessarily consume all characters with * and + modifiers?
import regex as re
matches = re.findall(r"^\d+", "123")
print(matches)
# actual output: ['123']
# desired output: ['1', '12', '123']
I need the matches to be anchored to the start of the string (hence the ^), but the + doesn't even seem to be considering shorter-length matches. I tried adding overlapped=True to the findall call, but that does not change the output.
Making the regex non-greedy (^\d+?) makes the output ['1'], overlapped=True or not. Why does it not want to keep searching further?
I could always make shorter substrings myself and check those with the regex, but that seems rather inefficient, and surely there must be a way for the regex to do this by itself.
s = "123"
matches = []
for length in range(len(s) + 1):
    matches.extend(re.findall(r"^\d+", s[:length]))
print(matches)
# output: ['1', '12', '123']
# but clunky :(
Edit: the ^\d+ regex is just an example, but I need it to work for any possible regex. I should have stated this up front, my apologies.
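You could use overlapped=True with the PyPI regex module together with reverse searching, (?r). In a forward search the ^ forces every match to start at position 0, so after the first match there is no other start position left to try, with or without overlapped=True. A reverse search scans from the end of the string toward the start, so each match can end at a different position while still reaching the ^. Then reverse the resulting list from re.findall: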
import regex as re
res = re.findall(r"(?r)^\d+", "123", overlapped=True)
res.reverse()
print(res)
Output
['1', '12', '123']
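Since the question asks for something that works with other patterns too, the same idea can be wrapped in a small helper. This is only a sketch: the name prefix_matches is made up for illustration, it assumes the pattern is anchored with ^, and it uses finditer with the regex.REVERSE flag (the flag form of (?r)) so that patterns containing capture groups still return the whole match rather than the groups.

```python
import regex


def prefix_matches(pattern, text):
    """Return every match of a ^-anchored pattern, shortest first.

    Hypothetical helper, not part of the regex module.
    """
    # REVERSE scans from the end of the text toward the start;
    # overlapped=True allows successive matches to share characters.
    found = [
        m.group(0)
        for m in regex.finditer(pattern, text, flags=regex.REVERSE, overlapped=True)
    ]
    # The reverse search yields the longest match first, so flip the order.
    found.reverse()
    return found


print(prefix_matches(r"^\d+", "123"))
```

For the example from the question this prints the same list as the snippet above. I have not checked it against every kind of pattern, so treat it as a starting point.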
See a Python demo.
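If installing the third-party regex module is not an option, the substring workaround from the question can be written against the standard library re module without slicing, by passing an end position to fullmatch. This is a sketch under assumptions: the helper name prefix_matches_stdlib is made up, and it tests every prefix length, so it is no faster than the loop in the question. Like that loop, it lets end anchors and lookaheads see only the prefix, which can give different results from the reverse search for patterns that use them.

```python
import re


def prefix_matches_stdlib(pattern, text):
    """Return every prefix of text that the pattern matches in full.

    Hypothetical helper using only the standard library.
    """
    compiled = re.compile(pattern)
    # fullmatch(text, 0, end) treats the text as if it ended at `end`,
    # so each prefix is checked without creating a substring first.
    return [
        text[:end]
        for end in range(len(text) + 1)
        if compiled.fullmatch(text, 0, end)
    ]


print(prefix_matches_stdlib(r"^\d+", "123"))
# prints ['1', '12', '123']
```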