Converting Python complex string output like (-0-0j) into an equivalent complex string


In Python, I'd like a good way to convert its complex number string output into an equivalent string representation which, when interpreted by Python, gives the same value.

Basically I'd like a function complexStr2str(s: str) -> str that has the property that eval(complexStr2str(str(c))) is indistinguishable from c, for any c whose value is of type complex. However, complexStr2str() only has to deal with the kinds of string patterns that str() or repr() output for complex values. Note that for complex values str() and repr() do the same thing.

By "indistinguishable" I don't mean == in the Python sense; you can define (or redefine) that to mean anything you want. I mean that if a string a in a program represents some value, and you replace it with string b (which could be exactly a), then there is no way to tell the original program from the replacement by running them, short of introspection of the program.
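As a rough, checkable stand-in for "indistinguishable", one could compare the repr() of the two float parts. This is only a sketch under assumptions: same_complex and float_parts are hypothetical helper names, not anything from the standard library, and the check treats every NaN as the same value (it ignores NaN sign and payload).

```python
def float_parts(c):
    # repr() of a float round-trips finite values exactly and keeps the sign
    # of zero ('-0.0' vs '0.0'); every NaN is rendered as 'nan'.
    return (repr(c.real), repr(c.imag))


def same_complex(a, b):
    """Hypothetical helper: True if a and b have identical real and imaginary parts."""
    return float_parts(a) == float_parts(b)


neg_zero = complex(-0.0, -0.0)
pos_zero = complex(0.0, 0.0)

print(neg_zero == pos_zero)              # == cannot see the sign of zero
print(same_complex(neg_zero, pos_zero))  # the stricter check can
```

Unlike ==, this check also reports two NaN-valued complex numbers as the same, which is what you want when testing a round trip through text.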

Note that (-0-0j) is not the same thing as -0j although the former is what Python will output for str(-0j) or repr(-0j). As shown in the interactive session below, -0j has real and imaginary float parts -0.0 while -0-0j has real and imaginary float parts positive 0.0.

The problem is made even more difficult in the presence of values like nan and inf. Although in Python 3.5 and later you can import these values from math (math.nan and math.inf), for various reasons I'd like to avoid having to do that. However, using float("nan") is okay.

Consider this Python session:

>>> -0j
(-0-0j)
>>> -0j.imag
-0.0
>>> -0j.real
-0.0
>>> (-0-0j).imag
0.0  # this is not -0.0
>>> (-0-0j).real
0.0  # this is also not -0.0
>>> eval("-0-0j")
0j  # and so this is not -0j
>>> from math import atan2
>>> atan2(-0.0, -1.0)
-3.141592653589793
>>> atan2((-0-0j).imag, -1.0)
3.141592653589793
>>> -1e500j
(-0-infj)
>>> (-0-infj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'infj' is not defined

Addendum:

This question has generated something of a stir (e.g. there are a number of downvotes for this question and its accepted solution). And there have been a lot of edits to the question, so some of the comments might be out of date.

The main thrust of the criticism is that one shouldn't want to do this. However, parsing text produced by some existing program is a thing that happens all the time, and sometimes you just can't control the program that generated the data.

A related problem arises when you do control the program producing the output but the value still has to pass through text: write a better repr() function that handles floats and complex numbers properly and follows the principle stated at the end of this question. That is straightforward, if a little ugly, because to do it fully you also need to handle float and complex values inside composite types like lists, tuples, sets, and dictionaries.
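A minimal sketch of such a function, to make the idea concrete. The names better_repr and float_src are hypothetical, only exact built-in types are handled (subclasses fall through to plain repr()), and self-referencing containers are not supported.

```python
def float_src(x):
    """Source text that evaluates back to the float x (NaN payloads aside)."""
    s = repr(x)
    if s in ('inf', '-inf', 'nan'):
        return "float('" + s + "')"
    return s


def better_repr(obj):
    """Hypothetical repr() variant whose output can be eval'd without loss."""
    if type(obj) is float:
        return float_src(obj)
    if type(obj) is complex:
        return 'complex({0}, {1})'.format(float_src(obj.real), float_src(obj.imag))
    if type(obj) is list:
        return '[' + ', '.join(better_repr(x) for x in obj) + ']'
    if type(obj) is tuple:
        inner = ', '.join(better_repr(x) for x in obj)
        # A one-element tuple needs its trailing comma.
        return '(' + inner + (',' if len(obj) == 1 else '') + ')'
    if type(obj) is set:
        if not obj:
            return 'set()'  # '{}' would be an empty dict
        return '{' + ', '.join(better_repr(x) for x in obj) + '}'
    if type(obj) is dict:
        items = (better_repr(k) + ': ' + better_repr(v) for k, v in obj.items())
        return '{' + ', '.join(items) + '}'
    return repr(obj)  # everything else: fall back to the built-in repr


print(better_repr(complex(-0.0, -0.0)))
print(better_repr([1.0, complex(0.0, float('-inf'))]))
```

The output only uses the names complex and float, so evaluating it needs no imports.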

Finally, I'll say that it appears that Python's str() or repr() output for complex values is unhelpful, which is why this problem is more specific to Python than other languages that support complex numbers as a primitive datatype or via a library.

Here is a session that shows this:

>>> complex(-0.0, -0.0)
(-0-0j)  # confusing and can lead to problems if eval'd
>>> repr(complex(-0.0, -0.0))
'(-0-0j)' # 'complex(-0.0, -0.0)' would be the simplest, clearest, and most useful

Note that str() gets called when doing output such as via print(). repr() is the preferred method for this kind of use but here it is the same as str() and both have problems with things like inf and nan.

For any built-in type, eval(repr(c)) should be indistinguishable from c.


Solution

  • As @wim has noted in the comments, this is probably not the right solution to the real problem; it would be better to not have converted those complex numbers to strings via str in the first place. It's also quite unusual to care about the difference between positive and negative zero. But I can imagine rare situations where you do care about that difference, and getting access to the complex numbers before they get str()'d isn't an option; so here's a direct answer.

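    We can match the parts with a regex. The float pattern [+-]?(?:[0-9]+(?:\.[0-9]*)?(?:[eE][+-]?[0-9]+)?|nan|inf) covers every form that str() produces for the parts of a complex number. The real part must be followed by the sign of the imaginary part or by the end of the string; a plain \b after a looser pattern is not enough, because it would split '1.5j' into a real part '1.' and an imaginary part '5', and '1e+100j' into '1e+' and '100'. We need to use str(float(...)) on the matched parts to make sure they are safe as floating point strings; so e.g. '-0' gets mapped to '-0.0'. We also need special cases for infinity and NaN, so they are mapped to the executable Python code "float('...')" which will produce the right values.

    import re
    
    # One float as str() prints it inside a complex value: digits, optional
    # fraction, optional exponent, or nan/inf.
    FLOAT_REGEX = r'[+-]?(?:[0-9]+(?:\.[0-9]*)?(?:[eE][+-]?[0-9]+)?|nan|inf)'
    # The lookahead makes the real part end at the sign of the imaginary part
    # or at the end of the string, so '1.5j' is never split into '1.' and '5j'.
    COMPLEX_PATTERN = re.compile(
        r'^\(?(' + FLOAT_REGEX + r'(?=[+-]|\)?$))?(?:(' + FLOAT_REGEX + r')j)?\)?$')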
    
    def complexStr2str(s):
        m = COMPLEX_PATTERN.match(s)
        if not m:
            raise ValueError('Invalid complex literal: ' + s)
    
        def safe_float(t):
            t = str(float(0 if t is None else t))
            if t in ('inf', '-inf', 'nan'):
                t = "float('" + t + "')"
            return t
    
        real, imag = m.group(1), m.group(2)
        return 'complex({0}, {1})'.format(safe_float(real), safe_float(imag))
    

    Example:

    >>> complexStr2str(str(complex(0.0, 0.0)))
    'complex(0.0, 0.0)'
    >>> complexStr2str(str(complex(-0.0, 0.0)))
    'complex(-0.0, 0.0)'
    >>> complexStr2str(str(complex(0.0, -0.0)))
    'complex(0.0, -0.0)'
    >>> complexStr2str(str(complex(-0.0, -0.0)))
    'complex(-0.0, -0.0)'
    >>> complexStr2str(str(complex(float('inf'), float('-inf'))))
    "complex(float('inf'), float('-inf'))"
    >>> complexStr2str(str(complex(float('nan'), float('nan'))))
    "complex(float('nan'), float('nan'))"
    >>> complexStr2str(str(complex(1e100, 1e-200)))
    'complex(1e+100, 1e-200)'
    >>> complexStr2str(str(complex(1e-100, 1e200)))
    'complex(1e-100, 1e+200)'
    

    Examples for string inputs:

    >>> complexStr2str('100')
    'complex(100.0, 0.0)'
    >>> complexStr2str('100j')
    'complex(0.0, 100.0)'
    >>> complexStr2str('-0')
    'complex(-0.0, 0.0)'
    >>> complexStr2str('-0j')
    'complex(0.0, -0.0)'
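The property asked for in the question can also be checked mechanically rather than by spot examples. The sketch below runs the round trip over a grid of awkward values, including non-integer and exponent-form imaginary parts such as 1.5j and 1e+100j. So that it runs on its own, it carries a compact converter built from a strict float pattern plus a lookahead (an assumption of this sketch); swap in any other complexStr2str to test that instead. same_parts and SPECIAL are hypothetical names.

```python
import itertools
import re

FLOAT_REGEX = r'[+-]?(?:[0-9]+(?:\.[0-9]*)?(?:[eE][+-]?[0-9]+)?|nan|inf)'
COMPLEX_PATTERN = re.compile(
    r'^\(?(' + FLOAT_REGEX + r'(?=[+-]|\)?$))?(?:(' + FLOAT_REGEX + r')j)?\)?$')


def complexStr2str(s):
    m = COMPLEX_PATTERN.match(s)
    if not m:
        raise ValueError('Invalid complex literal: ' + s)

    def safe_float(t):
        # A missing part means positive zero.
        t = str(float(0 if t is None else t))
        if t in ('inf', '-inf', 'nan'):
            t = "float('" + t + "')"
        return t

    return 'complex({0}, {1})'.format(safe_float(m.group(1)), safe_float(m.group(2)))


def same_parts(a, b):
    # Compare via repr() so that -0.0 differs from 0.0 and nan matches nan.
    return (repr(a.real), repr(a.imag)) == (repr(b.real), repr(b.imag))


SPECIAL = [0.0, -0.0, 1.5, -2.0, 1e100, -1e-05,
           float('inf'), float('-inf'), float('nan')]

failures = []
for real, imag in itertools.product(SPECIAL, repeat=2):
    c = complex(real, imag)
    source = complexStr2str(str(c))
    # Only complex and float are needed to evaluate the generated source.
    rebuilt = eval(source, {'__builtins__': {}}, {'complex': complex, 'float': float})
    if not same_parts(c, rebuilt):
        failures.append((c, source))

print('round-trip failures:', failures)
```

Evaluating with an empty __builtins__ and only complex and float in scope keeps the check honest about what the generated string depends on.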