I have a dataset comprised of xdata
and ydata
that I want to fit a polynomial to, but for some reason, the fitting results depend on the dtype
of the dataset, even though the actual values of the data remain unchanged. I understand that if you change the dtype
e.g. from float
to int
, that there can be some loss of information, but in this case I am converting from 'f4'
to 'f8'
, thus no information is lost, which is why I am at a loss. What is going on here?
import numpy as np
from numpy.polynomial import polynomial
x32 = np.array([
1892.8972, 1893.1168, 1893.1626, 1893.4313, 1893.4929, 1895.6392,
1895.7642, 1896.4286, 1896.5693, 1897.313, 1898.4648
], dtype='f4')
y32 = np.array([
510.83655, 489.91592, 486.4508, 469.21814, 465.7902, 388.65576,
385.37637, 369.07236, 365.8301, 349.7118, 327.4062
], dtype='f4')
x64 = x32.astype('f8')
y64 = y32.astype('f8')
a, residuals1, _, _, _ = np.polyfit(x32, y32, 2, full=True)
b, residuals2, _, _, _ = np.polyfit(x64, y64, 2, full=True)
c, (residuals3, _, _, _) = polynomial.polyfit(x32, y32, 2, full=True)
d, (residuals4, _, _, _) = polynomial.polyfit(x64, y64, 2, full=True)
print(residuals1, residuals2, residuals3, residuals4) # [] [195.86309188] [] [195.86309157]
print(a) # [ 3.54575804e+00 -1.34738721e+04 1.28004924e+07]
print(b) # [-8.70836523e-03 7.50419309e-02 3.15525483e+04]
print(c[::-1]) # [ 3.54575804e+00 -1.34738721e+04 1.28004924e+07]
print(d[::-1]) # [-8.7083541e-03 7.5099051e-02 3.1552398e+04 ]
I also only noticed this issue because I'm also interested in the residuals values and they turned up to be empty, which caused my program to crash.
This differing behaviour is due to rcond
in polynomial
, which is precision-dependent:
rcond : float, optional
Relative condition number of the fit. Singular values smaller than
this relative to the largest singular value will be ignored. The
default value is len(x)*eps, where eps is the relative precision of
the float type, about 2e-16 in most cases.
...
# set rcond
if rcond is None:
rcond = len(x)*finfo(x.dtype).eps
Setting rcond
to an appropriately small value for the 32bit example will produce the same results as the 64bit one (e.g. rcond=1e-7
or smaller) .