python numpy math precision complex-numbers

Numpy decimal points precision of complex numbers

Consider the following Python code which does some simple arithmetic operations over complex numbers in Python:

import numpy as np

s = 2
l = 5
v = np.array([np.exp(1j*2*np.pi/l)])
A = pow(s*v, l) + s*v

#Print the precision of np.complex128
print np.finfo(np.complex128).precision

#Export using 20 decimal places for real and imaginary parts
np.savetxt('A.csv', A, fmt=['%.20e%+.20ej'], delimiter=',')

I saw this answer in order to print the precision of a np.complex128 variable in Python and what I get is 15. When I check the value of A in Spyder I see (32.61803398874989+1.9021130325903035j) whose imaginary part has more than 15 decimal places. Also, the exported CSV file has the value 3.26180339887498931262e+01+1.90211303259030350965e+00j whose both real and imaginary parts have 20 useful decimal places.

I'm confused here: if the precision is 15 what are those extra decimal places? According to here, np.complex128 consists of two 64-bit floats, one for the real and one for the imaginary part.

But my main issue is that I am doing many of these operations over complex numbers (like additions, multiplications and matrix inversions) in a program and each time I run I get different result, sometimes correct, sometimes wrong. It seems that my program is very sensitive to the accuracy of these operations. How is the precision over complex numbers quantified in Python and what is the maximum I can have?

I am using Python 2.7.14 and Numpy 1.14.3.

Solution

A complete answer would take much space. For starters, you would need to read a book on numerical analysis that explains just how the binary floating point types work and are stored. But here are some partial explanations.

The 64-bit float values are stored in binary, not in decimal. So most of what is said about decimal digits must be an approximation, not the full story.

When np.finfo(np.complex128).precision (or anything else) says that there are 15 significant decimal digits, it means that you cannot count on more than those 15 digits. Here is an example, taken from my IPython console.

In [4]: 9.000000000000001
Out[4]: 9.000000000000002

In [5]: 9.000000000000001 == 9.000000000000002
Out[5]: True

Python, which by default uses the 64-bit floating point type for any number with a decimal point, treats those two 16-decimal-digit numbers as identical. When you use the number 9.000000000000001 only the first 15 decimal digits are guaranteed to be stored correctly. Anything after that is not guaranteed, and you see in this case the 1 is basically changed to a 2.

Sometimes you can get more than those 15 decimal digits. For example, since the numbers are stored in binary, a number slightly greater than 1.0 will have more binary digits after the radix point than a number like 9.0 will. This is because 9 uses 4 binary digits while 1 uses only one, so 3 more binary digits can be used after the radix point. So lets look at the imaginary part of your number, 1.9021130325903035:

In [17]: 1.902113032590303 == 1.9021130325903035
Out[17]: False

In [18]: 1.9021130325903036 == 1.9021130325903035
Out[18]: True

In [19]: 1.902113032590304 == 1.9021130325903035
Out[19]: False

We see that, although the number shows 17 decimal digits, when we change the final digit from 5 to 6 Python sees no change. But there is a change if we round that number to 16 decimal digits, either up or down. So we can say that the number is stored to 16 decimal digits and a little more. That is why when I explain precision I say that a floating point number is guaranteed to have 15 significant decimal digits but it may have a 16th, and the actual precision is slightly more than that. More briefly, there are 15 or 16 significant decimal digits.

Python has two basic ways to print a floating point number: the str function and the repr function. The str function prints things simply, so the user can understand the result without going into too much detail. The repr function gives more detail, and tries to print so much detail that the stored data can be completely determined by what is printed. Note that repr is used invisibly in some cases, such as typing a numeric value in the console or, as you saw, using the variable explorer in Spyder.

When Spyder or Python does a repr on your number 1.9021130325903035 it gives enough digits to completely define the number. As we saw above, if it showed only 16 decimal digits, dropping the final 5, the result would be slightly different from what is stored. Therefore Python prints an additional digit so you can know just what the value is. If Python printed a final 6 rather than a 5 the value would have been the same, but if Python left out that digit completely the value would be changed. So Python via repr errs on the side of giving too many digits. Despite the 17 digits printed, only 16 of them are certain.

Finally, your csv file shows 20 decimal digits because you told Python to show 20 decimal digits, due to the %.20e specifiers in your np.savetxt command. Those 20 decimal digits are not all "useful decimal places," despite what you wrote. Only the first 16 or 17 are useful in the imaginary part, depending on how you define "useful." Python basically added binary bits, all zeros, to the stored value, and printed that to 20 decimal digits. But those zero binary bits were not stored in the value.