So doing this (in Python 3.7.3):
>>> from re import findall
>>> s = '7.95 + 10 pieces'
>>> findall(r'(\d*\.)?\d+', s)
['7.', ''] # Expected return: ['7.95', '10']
I'm not sure why it doesn't find all the floats in the string. Is this possibly some Python quirk about capturing groups?
My logic behind the regex:
(\d*\.)?
optionally matches (zero or one time) any number of digits followed by a period.
\d+
then matches one or more digits, so this regex should match any of 11.11, 11, .11
and so on. What's wrong here?
As you guessed correctly, this has to do with capturing groups. According to the documentation for re.findall
:
If one or more groups are present in the pattern, return a list of groups
Therefore, you need to make each of your groups ()
non-capturing using the (?:...)
syntax. If the pattern has no capturing groups, findall returns the entire match:
>>> pattern = r'(?:\d*\.)?\d+'
>>> findall(pattern, s)
['7.95', '10']
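To see why the original pattern returned ['7.', ''], it may help to compare the two patterns side by side. The sketch below is illustrative only: the variable names are made up for this example, and the finditer variant is an alternative not mentioned above, useful if you want to keep the capturing group and still get the whole match.

```python
import re

s = '7.95 + 10 pieces'

# One capturing group: findall returns only what the group captured.
# For '7.95' the group holds '7.'; for '10' the optional group did not
# take part in the match, so findall reports an empty string.
capturing = re.findall(r'(\d*\.)?\d+', s)

# Non-capturing group: findall returns each whole match.
non_capturing = re.findall(r'(?:\d*\.)?\d+', s)

# Alternative: keep the capturing group and read the whole match
# (group 0) from each match object produced by finditer.
via_finditer = [m.group(0) for m in re.finditer(r'(\d*\.)?\d+', s)]

print(capturing)      # ['7.', '']
print(non_capturing)  # ['7.95', '10']
print(via_finditer)   # ['7.95', '10']
```

Either approach works; the non-capturing group is the smaller change when you only need the matched text.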