Search code examples
pythonpandastime-seriesfacebook-prophet

How to perform time series analysis that contains multiple groups in Python using fbProphet or other models?


All,

My dataset looks like following. I am trying to predict the 'amount' for next 6 months using either the fbProphet or other model. But my issue is that I would like to predict amount based on each groups i.e A,B,C,D for next 6 months. I am not sure how to do that in python using fbProphet or other model ? I referenced official page of fbprophet, but the only information I found is that "Prophet" takes two columns only One is "Date" and other is "amount" .

I am new to python, so any help with code explanation is greatly appreciated!

import pandas as pd
data = {'Date':['2017-01-01', '2017-02-01', '2017-03-01', '2017-04-01','2017-05-01','2017-06-01','2017-07-01'],'Group':['A','B','C','D','C','A','B'],
       'Amount':['12.1','13','15','10','12','9.0','5.6']}
df = pd.DataFrame(data)
print (df)

output:

         Date Group Amount
0  2017-01-01     A   12.1
1  2017-02-01     B     13
2  2017-03-01     C     15
3  2017-04-01     D     10
4  2017-05-01     C     12
5  2017-06-01     A    9.0
6  2017-07-01     B    5.6

Solution

  • fbprophet requires two columns ds and y, so you need to first rename the two columns

    df = df.rename(columns={'Date': 'ds', 'Amount':'y'})
    

    Assuming that your groups are independent from each other and you want to get one prediction for each group, you can group the dataframe by "Group" column and run forecast for each group

    from fbprophet import Prophet
    grouped = df.groupby('Group')
    for g in grouped.groups:
        group = grouped.get_group(g)
        m = Prophet()
        m.fit(group)
        future = m.make_future_dataframe(periods=365)
        forecast = m.predict(future)
        print(forecast.tail())
    

    Take note that the input dataframe that you supply in the question is not sufficient for the model because group D only has a single data point. fbprophet's forecast needs at least 2 non-Nan rows.

    EDIT: if you want to merge all predictions into one dataframe, the idea is to name the yhat for each observations differently, do pd.merge() in the loop, and then cherry-pick the columns that you need at the end:

    final = pd.DataFrame()
    for g in grouped.groups:
        group = grouped.get_group(g)
        m = Prophet()
        m.fit(group)
        future = m.make_future_dataframe(periods=365)
        forecast = m.predict(future)    
        forecast = forecast.rename(columns={'yhat': 'yhat_'+g})
        final = pd.merge(final, forecast.set_index('ds'), how='outer', left_index=True, right_index=True)
    
    final = final[['yhat_' + g for g in grouped.groups.keys()]]