python data-science facebook-prophet

Facebook Prophet Multivariate Forecasting "Found NaN in column"


I have a multivariate five-year dataset with five other features in addition to y. I have a train/test split where the training set is the first 4 years. I want to predict the 5th year so I can compare the results with the actual 5th-year data.

I build the future dataframe with make_future_dataframe(periods=365), then pandas.concat it with the training regressors (pandas.merge cuts off the blank 5th year): future_regressors = pandas.concat([future, train[['f1', 'f2', 'f3', 'f4', 'f5']]], axis=1). Prophet, however, requires the 5 regressors to have values in the 5th year (it complains about the NaNs that start at the beginning of the 5th year), but that data doesn't exist in the training set.
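To make the source of the NaNs concrete, here is a minimal pandas-only sketch of the concat step. The tiny frames and the single column f1 are made-up stand-ins for the real data (5 "training" days, 3 "future" days), not the actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for the training set: 5 days with one regressor.
train = pd.DataFrame({
    "ds": pd.date_range("2020-01-01", periods=5, freq="D"),
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Stand-in for make_future_dataframe: the 5 history days plus 3 future days.
future = pd.DataFrame({"ds": pd.date_range("2020-01-01", periods=8, freq="D")})

# concat with axis=1 aligns on the row index, so the rows that exist only
# in `future` (index 5..7) get NaN for every regressor column.
future_regressors = pd.concat([future, train[["f1"]]], axis=1)

print(future_regressors)
print(future_regressors["f1"].isna().sum())  # 3
```

The history rows keep their values; only the rows past the end of the training set are empty, which is exactly where Prophet reports the NaNs.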

What's the best way to go about this?


Solution

  • Found the answer on this SO post, specifically:

    Note that the additional variables should have values for your future (test) data. If you don't have them, you could start by predicting add1 and add2 with univariate timeseries, and then predict y with add_regressor and the predicted add1 and add2 as future values of the additional variables.