python-3.x · tensorflow · keras · neural-network · forecasting

Out of Sample Forecasting using Neural Network in Keras (Python)


I am doing a time series forecasting exercise using the window method, but I am struggling to understand how to do the forecast out of sample. Here is the code:

import numpy as np
import tensorflow as tf

def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
  # Slice the series into overlapping windows of window_size + 1 values,
  # shuffle them, and split each window into (features, label) pairs.
  dataset = tf.data.Dataset.from_tensor_slices(series)
  dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)
  dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))
  dataset = dataset.shuffle(shuffle_buffer).map(lambda window: (window[:-1], window[-1]))
  dataset = dataset.batch(batch_size).prefetch(1)
  return dataset

dataset = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size)

The windowed_dataset function splits the univariate time series series into a matrix of windows. Imagine we have a dataset as follows:

dataset = tf.data.Dataset.range(10)
for val in dataset:
   print(val.numpy())
0
1
2
3
4
5
6
7
8
9

The windowed_dataset function converts series into windows, with the x features on the left and the y label on the right:

[2 3 4 5] [6]
[4 5 6 7] [8]
[3 4 5 6] [7]
[1 2 3 4] [5]
[5 6 7 8] [9]
[0 1 2 3] [4]
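
The pairs above can be reproduced by iterating over the dataset directly. Here is a minimal sketch, assuming window_size=4, batch_size=1, and a shuffle buffer of 10 (the printed order will vary because of the shuffle):

import numpy as np

# Inspect the (features, label) pairs produced from the 0..9 example series.
series_example = np.arange(10)
example = windowed_dataset(series_example, window_size=4,
                           batch_size=1, shuffle_buffer=10)

for x, y in example:
  # x has shape (1, 4) and y has shape (1,); print the single window in each batch.
  print(x.numpy()[0], y.numpy())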

In the next step, we define the neural network model and fit it on the training dataset as follows:

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=[window_size], activation="relu"),
    tf.keras.layers.Dense(10, activation="relu"), 
    tf.keras.layers.Dense(1)
])

model.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(learning_rate=1e-6, momentum=0.9))
model.fit(dataset,epochs=100,verbose=0)
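
The trained model takes a batch of windows of shape (batch, window_size) and returns one prediction per window. As a quick sanity check (a sketch, assuming x_train is the same 1-D training series passed to windowed_dataset above):

# Predict the value that follows the final window of the training series.
last_window = x_train[-window_size:]            # shape (window_size,)
print(model.predict(last_window[np.newaxis]))   # np.newaxis adds the batch dimension -> output shape (1, 1)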

Up to here, I am fine with the code. However, I am struggling to understand the out-of-sample forecasting shown below:

forecast = []
for time in range(len(series) - window_size):
  # Predict the value that follows each window of the full series;
  # np.newaxis adds the batch dimension expected by model.predict.
  forecast.append(model.predict(series[time:time + window_size][np.newaxis]))
# Keep only the predictions that fall in the validation period.
forecast = forecast[split_time - window_size:]

Can someone please explain why we are using a loop here (for time in range(len(series) - window_size))? Why not simply call model.predict(dataset_validation) for the validation part and model.predict(dataset) for the training part?

I don't understand the need for the for loop, because this is not a rolling forecast and we are not re-training the model. Can someone please explain?

While I understand why the data science community structures the dataset this way, I personally find it a lot clearer to split X and y explicitly and then fit with model.fit(X, y, epochs=100, verbose=0) and predict with model.predict(X).
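
For reference, a minimal sketch of that explicit split, assuming series is a 1-D NumPy array (the helper build_xy is hypothetical and not part of the original code):

import numpy as np

def build_xy(series, window_size):
  # Hypothetical helper: stack every window of length window_size into X,
  # with the value immediately after each window as its label in y.
  X = np.array([series[t:t + window_size]
                for t in range(len(series) - window_size)])
  y = series[window_size:]
  return X, y

X, y = build_xy(series, window_size)
model.fit(X, y, epochs=100, verbose=0)
predictions = model.predict(X)   # predictions come back in time order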


Solution

  • The for loop returns the predictions in order, whereas if you call model.predict(dataset_validation) you'll get the predictions in a shuffled order (assuming you shuffled the dataset).

    As for the point of using datasets: they can just help with code organization. There is no need to ever use one if you don't want to. If you do want to predict on a dataset and keep the time order, see the sketch below.
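
    The key is to build the prediction dataset without shuffling. A minimal sketch, where windowed_dataset_for_prediction is a hypothetical variant of the windowed_dataset above with the shuffle and the label split removed:

    import tensorflow as tf

    def windowed_dataset_for_prediction(series, window_size, batch_size):
      # Same windowing as before, but no shuffle and no label split,
      # so model.predict returns one forecast per window in time order.
      dataset = tf.data.Dataset.from_tensor_slices(series)
      dataset = dataset.window(window_size, shift=1, drop_remainder=True)
      dataset = dataset.flat_map(lambda window: window.batch(window_size))
      dataset = dataset.batch(batch_size).prefetch(1)
      return dataset

    pred_dataset = windowed_dataset_for_prediction(series, window_size, batch_size)
    forecast = model.predict(pred_dataset)            # one forecast per window, in time order
    forecast = forecast[split_time - window_size:]    # keep the validation period, as in the loop above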