Tags: python, pandas, matplotlib, machine-learning, arima

Getting error "TypeError: no numeric data to plot" in a Time Series ARIMA analysis


I am following a tutorial that performs an ARIMA time series analysis on differenced data.

This is the Python code:

def difference(dataset):
    diff = list()
    for i in range(1, len(dataset)):
        value = dataset[i] - dataset[i - 1]
        diff.append(value)
    return Series(diff)

series = pd.read_csv('dataset.csv')
X = series.values  # The error in building the list can be seen here
X = X.astype('float32')
stationary = difference(X)
stationary.index = series.index[1:]
...
stationary.plot()
pyplot.show()

When the process reaches the plotting stage I get the error:

TypeError: no numeric data to plot

Tracing back, I find that the parsed data ends up as a collection of arrays. Saving stationary as a *.csv file gives a list like:

[11.]
[0.]
[16.]
[45.]
[27.]
[-141.]
[46.]

Can somebody tell me what is going wrong here?

PS. I have excluded the library imports.

Edit 1

A section of the dataset is reproduced below:

Year,Obs
1994,21
1995,62
1996,56
1997,29
1998,38
1999,201

Solution

  • To difference, just use Series.diff() or DataFrame.diff().

    Also, Year should be the index:

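    from matplotlib.ticker import MaxNLocator
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    # Sample rows from the question
    df = pd.DataFrame(
        {'Year': np.arange(1994, 2000),
         'Obs': [21, 62, 56, 29, 38, 201]})

    # Index by Year, then take first differences
    stationary = df.set_index('Year').diff()
    ax = stationary.plot(legend=False)
    # Keep the year ticks on whole numbers
    ax.xaxis.set_major_locator(MaxNLocator(integer=True))
    plt.show()
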

    Output:

    [image: plot of the differenced series]
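
As for why the original code fails, a likely explanation is that pd.read_csv returns a DataFrame, so .values is a 2-D array and the difference() helper ends up building a Series whose elements are arrays (object dtype), which plot() rejects as non-numeric. The sketch below reproduces that under the assumption that dataset.csv has the Year,Obs layout shown in the question; the in-memory CSV_TEXT string is only a stand-in for the real file.

```python
import io

import pandas as pd

# Stand-in for dataset.csv, built from the sample rows in the question.
CSV_TEXT = """Year,Obs
1994,21
1995,62
1996,56
1997,29
1998,38
1999,201
"""


def difference(dataset):
    """The question's helper: difference between consecutive rows."""
    diff = [dataset[i] - dataset[i - 1] for i in range(1, len(dataset))]
    return pd.Series(diff)


# read_csv returns a DataFrame, so .values is a 2-D array and each
# dataset[i] is itself a small array rather than a scalar.
frame = pd.read_csv(io.StringIO(CSV_TEXT))
X = frame.values.astype('float32')
print(X.shape)       # (6, 2)

broken = difference(X)
print(broken.dtype)  # object: a Series of arrays, nothing numeric to plot

# Selecting a single column gives a 1-D Series, and diff() stays numeric.
obs = pd.read_csv(io.StringIO(CSV_TEXT), index_col='Year')['Obs']
stationary = obs.diff().dropna()
print(stationary.dtype)  # float64
```

With a numeric Series like this, stationary.plot() should no longer raise the TypeError. The dropna() call only removes the leading NaN that diff() produces for the first row.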