I'm trying to write a regular expression for complex numbers (just as an exercise to study the re module), but I can't get it to work. I want the regex to match strings of the form '12+18j', '-14+45j', '54', '-87j' and so on. My attempt:
import re
num = r'[+-]?(?:\d*.\d+|\d+)'
complex_pattern = rf'(?:(?P<real>{num})|(?P<imag>{num}j))|(?:(?P=real)(?P=imag))'
complex_pattern = re.compile(complex_pattern)
But it doesn't work the way I want:
m = complex_pattern.fullmatch('1+12j')
m.groupdict()
Out[166]: {'real': None, 'imag': '1+12j'}
The reason for this structure is that I want the input string to contain a real part, an imaginary part, or both, and I also want to be able to extract the real and imag groups from the match object. There is another approach I tried, and it seems to work except that it also matches empty strings (''):
complex_pattern = rf'(?P<real>{num})+(?P<imag>{num}j)+'
complex_pattern = re.compile(complex_pattern)
I guess I could check for an empty string with a simple if, but I'm interested in a purer approach, and I'd like to know why the first implementation doesn't work as expected.
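On why the first attempt misbehaves: as far as I can tell, two things are going on. The `.` in `num` is unescaped, so it matches any character, including `+`, which lets `{num}j` swallow the whole of '1+12j' as the imaginary part. Also, `(?P=real)` is a backreference: it matches the exact text a group has already captured rather than reusing the group's pattern, so the second alternative cannot match when neither group has participated. A minimal sketch of both effects (the names `num_unescaped`, `num_escaped` and the toy group `d` are just for illustration):

```python
import re

# The question's number pattern: the unescaped "." matches any character, "+" included.
num_unescaped = r'[+-]?(?:\d*.\d+|\d+)'
# Same pattern with the dot escaped, so it only matches a literal decimal point.
num_escaped = r'[+-]?(?:\d*\.\d+|\d+)'

loose = re.fullmatch(num_unescaped + 'j', '1+12j')
strict = re.fullmatch(num_escaped + 'j', '1+12j')
print(loose)   # a match object spanning the whole string
print(strict)  # None

# (?P=name) matches the text a group already captured; it does not reuse the pattern.
same = re.fullmatch(r'(?P<d>\d)(?P=d)', '11')
different = re.fullmatch(r'(?P<d>\d)(?P=d)', '12')
print(same is not None, different is not None)  # True False
```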
I suggest using
import re
pattern = r'^(?!$)(?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))?(?:(?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j)))?$'
texts = ['1+12j', '12+18j','-14+45j','54','-87j']
for text in texts:
    match = re.fullmatch(pattern, text)
    if match:
        print(text, '=>', match.groupdict())
    else:
        print(f'{text} did not match!')
Output:
1+12j => {'real': '1', 'sign1': '', 'number1': '1', 'imag': '+12j', 'sign2': '+', 'number2': '12j'}
12+18j => {'real': '12', 'sign1': '', 'number1': '12', 'imag': '+18j', 'sign2': '+', 'number2': '18j'}
-14+45j => {'real': '-14', 'sign1': '-', 'number1': '14', 'imag': '+45j', 'sign2': '+', 'number2': '45j'}
54 => {'real': '54', 'sign1': '', 'number1': '54', 'imag': None, 'sign2': None, 'number2': None}
-87j => {'real': '-8', 'sign1': '-', 'number1': '8', 'imag': '7j', 'sign2': '', 'number2': '7j'}
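Note that the output for '-87j' splits the input into real '-8' and imag '7j': the optional real group first takes '-87', and backtracking then hands the last digit to the imag group. One possible way to avoid this, sketched here under the assumption that a negative lookahead is acceptable, is to forbid the real part from being directly followed by more digits or dots and a `j`. The names `number` and `fixed_pattern` are illustrative and not part of the pattern above, and the helper sub-groups for sign and number are left out for brevity.

```python
import re

# Unsigned number: digits with an optional fractional part.
number = r'\d+(?:\.\d+)?'

# Same idea as the pattern above, but the real part must not run straight into
# further digits/dots followed by "j" (that text belongs to the imaginary part).
fixed_pattern = re.compile(
    rf'^(?!$)(?P<real>[+-]?{number}(?![\d.]*j))?(?P<imag>[+-]?{number}j)?$'
)

results = {}
for text in ['1+12j', '12+18j', '-14+45j', '54', '-87j', '']:
    match = fixed_pattern.fullmatch(text)
    results[text] = match.groupdict() if match else None
    print(repr(text), '=>', results[text])
```

With this variant '-87j' should come out as real None and imag '-87j', while the other inputs keep the same real and imag values as before.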
Details
- ^ - start of string
- (?!$) - no end of string should follow at this position (no empty input is allowed)
- (?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))? - an optional "real" group:
    - (?P<sign1>[+-]?) - an optional - or + sign captured into Group "sign1"
    - (?P<number1>\d+(?:\.\d+)?) - one or more digits followed by an optional sequence of a . and one or more digits, captured into Group "number1"
- (?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j))? - an optional sequence captured into the "imag" group:
    - (?P<sign2>[+-]?) - an optional - or + sign captured into Group "sign2"
    - (?P<number2>\d+(?:\.\d+)?j) - one or more digits followed by an optional sequence of a . and one or more digits, and then a j char, captured into Group "number2"
- $ - end of string.
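The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace in the pattern and treats # as the start of a comment. This is only a sketch of an alternative layout; the name `verbose_pattern` is made up here, and the groups are meant to be the same as in the one-line pattern.

```python
import re

verbose_pattern = re.compile(r'''
    ^(?!$)                               # start of string, reject empty input
    (?P<real>                            # optional "real" group
        (?P<sign1>[+-]?)                 #   optional sign
        (?P<number1>\d+(?:\.\d+)?)       #   digits, optional fractional part
    )?
    (?P<imag>                            # optional "imag" group
        (?P<sign2>[+-]?)                 #   optional sign
        (?P<number2>\d+(?:\.\d+)?j)      #   digits, optional fraction, then "j"
    )?
    $                                    # end of string
''', re.VERBOSE)

groups = verbose_pattern.fullmatch('-14+45j').groupdict()
print(groups)
```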