python time-series facebook-prophet train-test-split holtwinters

Splitting the training data based on a date column


I have weekly time series data and I want to test multiple time series models on it.

The raw data looks like this:

date    visits
1/22/2021   796105
1/29/2021   742833
2/5/2021    918413
2/12/2021   806033
.
.
.
9/23/2022   3610023
9/30/2022   2833338

I want to split the data into multiple training data frames, each ending at a different cutoff date, and always forecast the 12 weeks that follow the cutoff.

For example:

train_1 = data until 15-jan-2022
test_1 = next 12 weeks

train_2 = data until 15-feb-2022
test_2 = next 12 weeks
.
.
train_x = data until 15-jul-2022
test_x = next 12 weeks

Later I want to run my Holt-Winters forecasting algorithm in a for loop over these splits. I looked at https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html but couldn't work out how to apply it to my dataset.

Can someone help? Thank you in advance!


Solution

  • Depending on the cutoff dates you want, you can slice the data frame by row position as follows:

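    # Each tuple is (train, test). The row positions assume one row per week
    # starting 1/22/2021, so row 52 is the first week after 15-Jan-2022.
    # Column 0 is assumed to hold the series to forecast.
    # Each test slice spans 12 rows (12 weeks); iloc silently truncates
    # slices that run past the end of the data.
    train_and_test = []
    train_and_test.append((df.iloc[:52, 0], df.iloc[52:64, 0]))
    train_and_test.append((df.iloc[:56, 0], df.iloc[56:68, 0]))
    train_and_test.append((df.iloc[:60, 0], df.iloc[60:72, 0]))
    train_and_test.append((df.iloc[:65, 0], df.iloc[65:77, 0]))
    train_and_test.append((df.iloc[:69, 0], df.iloc[69:81, 0]))
    train_and_test.append((df.iloc[:73, 0], df.iloc[73:85, 0]))
    train_and_test.append((df.iloc[:78, 0], df.iloc[78:90, 0]))
    train_and_test.append((df.iloc[:82, 0], df.iloc[82:94, 0]))

    for train_df, test_df in train_and_test:
        ...  # loop body elided in the original answer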
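
  • If you would rather not hard-code row positions, the same splits can be derived from the date column itself. The following is only a rough sketch: the helper name rolling_splits, its parameters, and the synthetic visits values are made up for illustration, and it assumes a data frame with a date column and a visits column as shown in the question.

```python
import pandas as pd


def rolling_splits(df, cutoffs, horizon=12, date_col="date"):
    """Hypothetical helper: return a list of (train, test) data frames.

    train holds every row dated on or before the cutoff;
    test holds the next `horizon` rows after the cutoff.
    """
    # Parse the dates (e.g. "1/22/2021") and make sure rows are in time order.
    df = df.assign(**{date_col: pd.to_datetime(df[date_col])})
    df = df.sort_values(date_col).reset_index(drop=True)

    splits = []
    for cutoff in pd.to_datetime(cutoffs):
        train = df[df[date_col] <= cutoff]
        test = df[df[date_col] > cutoff].head(horizon)
        splits.append((train, test))
    return splits


# Synthetic stand-in for the weekly data in the question (visits are made up).
dates = pd.date_range("2021-01-22", "2022-09-30", freq="7D")
df = pd.DataFrame({"date": dates, "visits": range(len(dates))})

# One cutoff on the 15th of each month, January through July 2022.
cutoffs = [pd.Timestamp(year=2022, month=m, day=15) for m in range(1, 8)]
splits = rolling_splits(df, cutoffs)

for train_df, test_df in splits:
    # Show the last training date and the size of each part.
    print(train_df["date"].max().date(), len(train_df), len(test_df))
```

    Each (train_df, test_df) pair can then be handed to whatever model you are evaluating inside the loop, for example a Holt-Winters model such as ExponentialSmoothing from statsmodels. Note that cutoffs close to the end of the data produce test sets shorter than 12 weeks, because fewer than 12 rows remain after the cutoff.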