I am using Python 3.9.13.
I am trying to use the findall function from the re library, but I am getting empty results.
The regex I am using is:
_regex = re.compile(r"(?:)0\d{1,4}(?:-?\d{2,4}-?\d{2,4}|\d{8}|\d(-)?\d{2,4}(-)?\d{3,4})")
I am testing that on the following string:
_text = "06-6206-567903-3668-067403-3668-400503-3668-429503-3668-432403-3668-039206-6206-572906-6206-630303-3668-481806-6206-564403-3668-053703-3668-070606-6206-5663"
From re.finditer, I am getting correct results:
_test = re.finditer(_regex, _text)
for item in _test:
    print(item)
<re.Match object; span=(0, 12), match='06-6206-5679'>
<re.Match object; span=(12, 24), match='03-3668-0674'>
<re.Match object; span=(24, 36), match='03-3668-4005'>
<re.Match object; span=(36, 48), match='03-3668-4295'>
<re.Match object; span=(48, 60), match='03-3668-4324'>
<re.Match object; span=(60, 72), match='03-3668-0392'>
<re.Match object; span=(72, 84), match='06-6206-5729'>
<re.Match object; span=(84, 96), match='06-6206-6303'>
<re.Match object; span=(96, 108), match='03-3668-4818'>
<re.Match object; span=(108, 120), match='06-6206-5644'>
<re.Match object; span=(120, 132), match='03-3668-0537'>
<re.Match object; span=(132, 144), match='03-3668-0706'>
<re.Match object; span=(144, 156), match='06-6206-5663'>
However, when using the re.findall function, I am getting empty results.
_test = re.findall(_regex, _text)
[('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', ''), ('', '')]
I am wondering whether this problem comes from the regex I am using (maybe the first non-capturing group?). Please help.
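After some testing, I believe the cause is the two pairs of capturing parentheses marked below in the regular expression you wrote:
(?:)0\d{1,4}(?:-?\d{2,4}-?\d{2,4}|\d{8}|\d(-)?\d{2,4}(-)?\d{3,4})
                                          ^ ^        ^ ^
When a pattern contains capturing groups, re.findall returns the groups instead of the whole match (a tuple of strings when there is more than one group). In your text the first alternative, -?\d{2,4}-?\d{2,4}, matches every number, so the two (-) groups never take part in a match. They are None when accessed with item.group(1, 2) on the re.finditer matches, and re.findall reports them as empty strings, which is why every result is ('', ''). When the parentheses are removed, giving a regular expression like this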
(?:)0\d{1,4}(?:-?\d{2,4}-?\d{2,4}|\d{8}|\d-?\d{2,4}-?\d{3,4})
re.findall produces the expected result:
['06-6206-5679', '03-3668-0674', '03-3668-4005', '03-3668-4295',
'03-3668-4324', '03-3668-0392', '06-6206-5729', '06-6206-6303',
'03-3668-4818', '06-6206-5644', '03-3668-0537', '03-3668-0706',
'06-6206-5663']
re.finditer still gives the same matches as before:
<re.Match object; span=(0, 12), match='06-6206-5679'>
<re.Match object; span=(12, 24), match='03-3668-0674'>
<re.Match object; span=(24, 36), match='03-3668-4005'>
<re.Match object; span=(36, 48), match='03-3668-4295'>
<re.Match object; span=(48, 60), match='03-3668-4324'>
<re.Match object; span=(60, 72), match='03-3668-0392'>
<re.Match object; span=(72, 84), match='06-6206-5729'>
<re.Match object; span=(84, 96), match='06-6206-6303'>
<re.Match object; span=(96, 108), match='03-3668-4818'>
<re.Match object; span=(108, 120), match='06-6206-5644'>
<re.Match object; span=(120, 132), match='03-3668-0537'>
<re.Match object; span=(132, 144), match='03-3668-0706'>
<re.Match object; span=(144, 156), match='06-6206-5663'>
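To make the difference concrete, here is a minimal sketch that runs both patterns on the first two numbers of the test string. The variable names (with_groups, without_groups, and so on) are only illustrative and not from the question. It also shows one alternative that I am adding as a suggestion: if you would rather keep the capturing groups, you can take group(0) from re.finditer instead of changing the pattern.

```python
import re

# Shortened sample: the first two numbers from the string in the question.
text = "06-6206-567903-3668-0674"

# Original pattern: the two "(-)" pairs are capturing groups.
with_groups = re.compile(
    r"(?:)0\d{1,4}(?:-?\d{2,4}-?\d{2,4}|\d{8}|\d(-)?\d{2,4}(-)?\d{3,4})"
)

# With capturing groups present, findall returns the groups, not the match.
# Groups that did not take part in the match come back as empty strings.
groups_only = with_groups.findall(text)
print(groups_only)  # [('', ''), ('', '')]

# On a match object the same groups are None.
first = with_groups.search(text)
print(first.group(1, 2))  # (None, None)

# Alternative (my suggestion): keep the pattern and read the whole match.
whole_matches = [m.group(0) for m in with_groups.finditer(text)]
print(whole_matches)  # ['06-6206-5679', '03-3668-0674']

# Fix from the answer: drop the capturing parentheses around the hyphens.
without_groups = re.compile(
    r"(?:)0\d{1,4}(?:-?\d{2,4}-?\d{2,4}|\d{8}|\d-?\d{2,4}-?\d{3,4})"
)
fixed = without_groups.findall(text)
print(fixed)  # ['06-6206-5679', '03-3668-0674']
```

If you need to group the hyphen for some other reason, writing it as (?:-)? should behave the same way for re.findall, since non-capturing groups are not included in its result.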