python, regex, complex-numbers

Regular expression for complex numbers


I'm trying to write a regular expression for complex numbers (just as an exercise to study the re module), but I can't get it to work. I want the regex to match strings of the form '12+18j', '-14+45j', '54', '-87j' and so on. My attempt:

import re

num = r'[+-]?(?:\d*.\d+|\d+)'
complex_pattern = rf'(?:(?P<real>{num})|(?P<imag>{num}j))|(?:(?P=real)(?P=imag))'
complex_pattern = re.compile(complex_pattern)

But it doesn't work the way I want:

m = complex_pattern.fullmatch('1+12j')
m.groupdict()

Out[166]: {'real': None, 'imag': '1+12j'}

The reason behind its structure is that I want the input string to contain the real part, the imaginary part, or both, and I also want to be able to extract the real and imag groups from the match object. There is another approach I tried, and it seems to work except that it also matches empty strings (''):

complex_pattern = rf'(?P<real>{num})?(?P<imag>{num}j)?'
complex_pattern = re.compile(complex_pattern)

I guess I could check for the empty string with a simple if, but I'm interested in a cleaner way, and I'd like to know why the first implementation doesn't work as expected.


Solution
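
On why the first attempt returns {'real': None, 'imag': '1+12j'}: two things in that pattern likely behave differently from what was intended. The dot in num is unescaped, so it matches any character (including '+'), which lets num swallow '1+12' as a single "number". And (?P=real) is a backreference to the text a group already captured, not a way to reuse the group's sub-pattern; since real and imag sit in a different alternation branch, the (?P=real)(?P=imag) branch can never match. A minimal sketch of both effects (num_strict and the digits group are illustrative names, not part of the question):

```python
import re

# The question's number pattern: the unescaped dot matches ANY character.
num_loose = r'[+-]?(?:\d*.\d+|\d+)'
# Illustrative variant with the dot escaped so it only matches a literal '.'.
num_strict = r'[+-]?(?:\d*\.\d+|\d+)'

loose = re.fullmatch(num_loose, '1+12')    # matches: '.' consumed the '+'
strict = re.fullmatch(num_strict, '1+12')  # no match: '+' is not a '.'
print(loose is not None, strict is not None)

# (?P=name) repeats the exact text captured earlier, not the sub-pattern.
backref = re.compile(r'(?P<digits>\d+)-(?P=digits)')
same = backref.fullmatch('12-12')       # matches: second part equals '12'
different = backref.fullmatch('12-34')  # no match: '34' != captured '12'
print(same is not None, different is not None)
```

To reuse a sub-pattern in Python's re module, interpolate the pattern string again (as the question already does with {num}) rather than using a backreference.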

  • I suggest using

    import re
    pattern = r'^(?!$)(?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))?(?:(?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j)))?$'
    texts = ['1+12j', '12+18j','-14+45j','54','-87j']
    for text in texts:
        match = re.fullmatch(pattern, text)
        if match:
            print(text, '=>', match.groupdict())
        else:
            print(f'{text} did not match!')
    

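    Output (note that '-87j' is split into real '-8' and imag '7j', because the optional "real" group gives back a digit so that "imag" can still match):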

    1+12j => {'real': '1', 'sign1': '', 'number1': '1', 'imag': '+12j', 'sign2': '+', 'number2': '12j'}
    12+18j => {'real': '12', 'sign1': '', 'number1': '12', 'imag': '+18j', 'sign2': '+', 'number2': '18j'}
    -14+45j => {'real': '-14', 'sign1': '-', 'number1': '14', 'imag': '+45j', 'sign2': '+', 'number2': '45j'}
    54 => {'real': '54', 'sign1': '', 'number1': '54', 'imag': None, 'sign2': None, 'number2': None}
    -87j => {'real': '-8', 'sign1': '-', 'number1': '8', 'imag': '7j', 'sign2': '', 'number2': '7j'}
    


    Details

    • ^ - start of string
    • (?!$) - no end of string should follow at this position (no empty input is allowed)
    • (?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))? - an optional sequence captured into the "real" group:
      • (?P<sign1>[+-]?) - an optional - or + sign captured into Group "sign1"
      • (?P<number1>\d+(?:\.\d+)?) - one or more digits followed by an optional sequence of a . and one or more digits, captured into Group "number1"
    • (?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j))? - an optional sequence captured into the "imag" group:
      • (?P<sign2>[+-]?) - an optional - or + sign captured into Group "sign2"
      • (?P<number2>\d+(?:\.\d+)?j) - one or more digits followed by an optional sequence of a . and one or more digits, and then a j char, captured into Group "number2"
    • $ - end of string.
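
With the pattern above, a purely imaginary input with two or more digits, such as '-87j', is split between the "real" and "imag" groups. One possible way to avoid that, sketched here as an assumption rather than as part of the original answer, is to add a negative lookahead after the real part so that it refuses to match when the digits it consumed are really the start of an imaginary term. The names NUMBER, COMPLEX_RE and split_complex are hypothetical, and the sign/number sub-groups are omitted for brevity:

```python
import re

# Assumed helper pattern (not from the answer): an unsigned decimal number.
NUMBER = r'\d+(?:\.\d+)?'

COMPLEX_RE = re.compile(
    r'(?!$)'                                   # reject the empty string
    # Real part: must NOT be followed by more digits/dots and then 'j',
    # otherwise the token belongs to the imaginary part.
    rf'(?P<real>[+-]?{NUMBER}(?![\d.]*j))?'
    # Imaginary part: optional sign, number, then 'j'.
    rf'(?P<imag>[+-]?{NUMBER}j)?'
)

def split_complex(text):
    """Return (real, imag) as strings, or None if text does not match."""
    m = COMPLEX_RE.fullmatch(text)
    return (m['real'], m['imag']) if m else None

for text in ['1+12j', '-14+45j', '54', '-87j', '1.5j', '']:
    print(repr(text), '=>', split_complex(text))
```

Under this sketch, '-87j' yields (None, '-87j') and '1+12j' still yields ('1', '+12j'). Because fullmatch is used, the ^ and $ anchors are not needed.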