Given a fitted polynomial using the numpy Polynomial convenience class (>1.14)
p = np.polynomial.Polynomial.fit(...)
what is the best way of saving and loading this fit?
There are three arguments which are needed to reproduce the object, namely p.coef
, p.domain
and p.window
.
See this question for what happens when only the coefficients are loaded.
One could save all three as different files and re-load them, but that's cumbersome and annoying.
My two initial ideas would be to save the entire object, either as a 0d-array
np.save("polyfit.npy", p, allow_pickle=True)
np.load("polyfit.npy", allow_pickle=True).item()
or as a 1d-array, as I honestly had to look up .item()
for accessing the element.
np.save("polyfit.npy", [p], allow_pickle=True)
np.load("polyfit.npy", allow_pickle=True)[0]
Both of these ways require using the pickle module, which causes cross-version / platform issues and is a potential security issue. So I'd rather avoid using it.
Are there any recommended ways of doing this?
The documentation doesn't seem to contain any good hint
https://numpy.org/doc/stable/reference/routines.polynomials.html
https://numpy.org/doc/stable/reference/routines.polynomials.classes.html
Make a sample poly series:
In [432]: x=np.linspace(0,1,10); y = x**3
In [433]: p = np.polynomial.Polynomial.fit(x,x**3, 3); p
Out[433]:
Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1., 1.], symbol='x')
So it's a class instance with several attributes, which are arrays and string. And we can make new fitted array with p(x)
:
In [434]: np.allclose(x**3,p(x))
Out[434]: True
As you note we can save and load it, allowing pickle:
In [436]: np.save('test.npy', p); pp = np.load('test.npy',allow_pickle=True)
In [437]: pp
Out[437]:
array(Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1., 1.], symbol='x'),
dtype=object)
This is a single item object dtype array, containing the poly series instance. We get at that instance with .item()
:
In [438]: np.allclose(x**3,pp.item()(x))
Out[438]: True
We can recreate a poly series using the 3 attributes:
In [439]: np.polynomial.Polynomial(p.coef,p.domain, p.window)
Out[439]:
Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1., 1.], symbol='x')
And do the same with the attributes of pp.item()
In [440]: p1=pp.item(); np.polynomial.Polynomial(p1.coef,p1.domain, p1.window)
Out[440]:
Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1., 1.], symbol='x')
An alternate save is with savez
, which saves multiple npy
files in a zip
archive:
In [441]: np.savez('test.npz', **vars(p))
In [442]: dd = np.load('test.npz')
In [443]: list(dd.keys())
Out[443]: ['coef', 'domain', 'window', '_symbol']
dd
isn't a poly series, but the 3 keys can be used to recreate one:
In [444]: np.polynomial.Polynomial(dd['coef'],dd['domain'], dd['window'])
Out[444]:
Polynomial([0.125, 0.375, 0.375, 0.125], domain=[0., 1.], window=[-1., 1.], symbol='x')
This does't use pickle since the 3 keys are arrays.
We don't have to worry about _symbol
, since it's a python string in the original, and an char array in the load:
In [471]: p._symbol, dd['_symbol']
Out[471]: ('x', array('x', dtype='<U1'))