
Python Prophet TypeError: arg must be a list, tuple, 1-d array, or Series


I am trying to use Prophet to forecast Lululemon's stock prices. However, I am encountering the following error when fitting the model:

TypeError                                 Traceback (most recent call last)
Cell In[3], line 17
     15 # Fit the data to a Prophet model
     16 model = Prophet()
---> 17 model.fit(lululemon_data)
     19 # Create a dataframe to hold predictions for the next 5 years
     20 future = model.make_future_dataframe(periods=5*365)
TypeError: arg must be a list, tuple, 1-d array, or Series

Here is my code:

python

import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
from prophet import Prophet

# Download Lululemon stock data
ticker = 'LULU'
lululemon_data = yf.download(ticker, start='2007-07-27')  # Lululemon IPO date

# Prepare the data for Prophet
lululemon_data.reset_index(inplace=True)
lululemon_data = lululemon_data[['Date', 'Close']]
lululemon_data.rename(columns={'Date': 'ds', 'Close': 'y'}, inplace=True)

# Fit the data to a Prophet model
model = Prophet()
model.fit(lululemon_data)

# Create a dataframe to hold predictions for the next 5 years
future = model.make_future_dataframe(periods=5*365)

# Make predictions
forecast = model.predict(future)

# Plot the forecast
fig = model.plot(forecast)
plt.title('Lululemon Stock Price Forecast for Next 5 Years')
plt.xlabel('Date')
plt.ylabel('Closing Price (USD)')
plt.show()

I suspect the issue lies in the structure of the lululemon_data DataFrame because the code works correctly when I use the example data provided by Prophet:

python

df = pd.read_csv('https://raw.githubusercontent.com/facebook/prophet/main/examples/example_wp_log_peyton_manning.csv')
df.head()

I've tried to ensure the column names are correctly renamed to ds and y.


Solution

  • Non-expert opinion here, after fiddling around with your code and ChatGPT.

    For your particular setup, the .columns of the two data frames differ:

    >>> df.columns
    Index(['ds', 'y'], dtype='object')
    >>> ld.columns  # lululemon_data
    MultiIndex([('ds',     ''),
                ( 'y', 'LULU')],
               names=['Price', 'Ticker'])
    

    The Lululemon frame has two-level (MultiIndex) columns, which yfinance probably uses to differentiate between multiple tickers. With columns like these, lululemon_data['y'] selects a one-column DataFrame rather than a Series, and that is most likely what confuses Prophet.

    It is possible to disable the multi-level index in yfinance.download. From its documentation:

    multi_level_index: bool

    Optional. Always return a MultiIndex DataFrame? Default is True

    I toggled multi_level_index to False, which seems to fix your code.

    lululemon_data = yf.download(ticker, start='2007-07-27', multi_level_index=False)  # Lululemon IPO date
    

    For other cases where the underlying library cannot return flat columns, maybe this thread can help with flattening a MultiIndex.
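
To illustrate the point about flattening, here is a minimal sketch that needs no network access. It builds a small synthetic frame with the same two-level columns shown above (the values are made up, and flatten_columns is a hypothetical helper name, not part of pandas, yfinance, or Prophet). It shows that selecting 'y' yields a DataFrame, that pandas.to_numeric rejects such 2-D input with a TypeError, and that keeping only the first column level restores a plain Series. Whether Prophet hits exactly this call internally is an assumption on my part; the error text matches.

```python
import pandas as pd

# Synthetic stand-in for the reshaped yfinance frame (values are made up).
columns = pd.MultiIndex.from_tuples(
    [("ds", ""), ("y", "LULU")], names=["Price", "Ticker"]
)
frame = pd.DataFrame(
    [[pd.Timestamp("2024-01-02"), 100.0], [pd.Timestamp("2024-01-03"), 101.5]],
    columns=columns,
)

# With two-level columns, selecting "y" gives a one-column DataFrame, not a Series.
selected = frame["y"]
print(type(selected).__name__)

# pandas.to_numeric only accepts 1-d input, so a DataFrame is rejected.
try:
    pd.to_numeric(selected)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)


def flatten_columns(df):
    """Return a copy of df whose columns keep only the first level (hypothetical helper)."""
    flat = df.copy()
    if isinstance(flat.columns, pd.MultiIndex):
        flat.columns = flat.columns.get_level_values(0)
    return flat


flat = flatten_columns(frame)
print(list(flat.columns))
print(type(flat["y"]).__name__)
```

Keeping only the first level is fine for a single ticker. With several tickers the first-level names would repeat, so you would need to join the levels into unique names instead.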