Why am I getting "ValueError: data type <class 'numpy.object_'> not inexact" when using the polyfit function?


I am trying to plot a trendline for my data, but I am getting this error:

ValueError: data type <class 'numpy.object_'> not inexact.  

Can someone explain why?

My dataframe is Us_corr3:

[image of the dataframe]

Here is my code:

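    import numpy as np
    import matplotlib.pyplot as plt

    # Us_corr3 is the dataframe shown above
    data5 = Us_corr3[['US GDP', 'US Unemployment']]
    x = data5['US GDP']
    y = data5['US Unemployment']

    plt.scatter(x, y)

    z = np.polyfit(x, y, 1)  # degree-1 (linear) fit; the ValueError comes from this call
    p = np.poly1d(z)
    plt.plot(x, p(x), "r--")
    plt.show()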

It fails with:

ValueError: data type <class 'numpy.object_'> not inexact.

Solution

  • If x, the array derived from your Series, has object dtype, np.polyfit raises exactly your error:

    In [67]: np.polyfit(np.arange(3).astype(object),np.arange(3),1)                                      
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-67-787351a47e03> in <module>
    ----> 1 np.polyfit(np.arange(3).astype(object),np.arange(3),1)
    
    <__array_function__ internals> in polyfit(*args, **kwargs)
    
    /usr/local/lib/python3.6/dist-packages/numpy/lib/polynomial.py in polyfit(x, y, deg, rcond, full, w, cov)
        605     # set rcond
        606     if rcond is None:
    --> 607         rcond = len(x)*finfo(x.dtype).eps
        608 
        609     # set up least squares equation for powers of x
    
    /usr/local/lib/python3.6/dist-packages/numpy/core/getlimits.py in __new__(cls, dtype)
        380             dtype = newdtype
        381         if not issubclass(dtype, numeric.inexact):
    --> 382             raise ValueError("data type %r not inexact" % (dtype))
        383         obj = cls._finfo_cache.get(dtype, None)
        384         if obj is not None:
    
    ValueError: data type <class 'numpy.object_'> not inexact
    

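    As the traceback shows, polyfit calls finfo(x.dtype), which is defined only for floating-point ("inexact") dtypes, so an object array is rejected. Functions like this expect arrays with a numeric dtype. Clean up your dataframe first!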
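
One way to do that cleanup is sketched below. It is an illustration under assumptions, not a diagnosis of the actual Us_corr3: the small dataframe here is a hypothetical stand-in whose columns hold numbers stored as strings (plus one unparseable entry), which is a common reason for object dtype. The sketch checks the dtypes, converts the columns with pd.to_numeric, drops rows that could not be converted, and then runs the same polyfit call.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Us_corr3: values stored as strings, so both
# columns have object dtype (forced explicitly here with dtype=object).
df = pd.DataFrame(
    {
        "US GDP": ["1.0", "2.0", "3.0", "4.0", "n/a"],
        "US Unemployment": ["2.0", "4.0", "6.0", "8.0", "10.0"],
    },
    dtype=object,
)

# Step 1: inspect the dtypes; object columns are what trips np.polyfit.
print(df.dtypes)

# Step 2: convert to numeric. errors="coerce" turns unparseable entries
# (such as "n/a") into NaN, and dropna() then removes those rows.
clean = (
    df[["US GDP", "US Unemployment"]]
    .apply(pd.to_numeric, errors="coerce")
    .dropna()
)

x = clean["US GDP"].to_numpy(dtype=float)
y = clean["US Unemployment"].to_numpy(dtype=float)

# Step 3: the fit now runs, because x and y are float arrays.
z = np.polyfit(x, y, 1)  # [slope, intercept]
p = np.poly1d(z)
print(z)
```

With x and y cleaned this way, the plotting lines from the question (plt.scatter(x, y) and plt.plot(x, p(x), "r--")) can be used unchanged. If the real columns contain thousands separators, percent signs, or similar, those would need to be stripped before pd.to_numeric can parse them.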