Tags: deep-learning, neural-network, time-series, lstm, forecasting

Forecasting with LSTM and use of final hidden states


I'm working on a seq2seq forecasting problem with a stateless LSTM (stateful=False in Keras, and return_state=False).

Let's say I have 10 independent time series with shape (10, 50, 2), where 10 is the number of samples, 50 is the number of timesteps, and 2 is the number of features. I split the series into a training part of shape (10, 35, 2) and a test part of shape (10, 15, 2), where the test part serves as a validation set for the forecast (to see how well the trained network forecasts). Since the LSTM is stateless in this case, the states are flushed (reset) between batches during training (say batch_size = 1, i.e. training on one sample at a time).
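For concreteness, here is roughly what that setup looks like with random data (the array names are just for illustration):

```python
import numpy as np

# 10 independent series, 50 timesteps, 2 features (random data, purely illustrative)
data = np.random.rand(10, 50, 2)

# first 35 timesteps for training, last 15 held out for forecasting/validation
train = data[:, :35, :]   # shape (10, 35, 2)
test = data[:, 35:, :]    # shape (10, 15, 2)
```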

My main question is: during prediction on the test set of shape (10, 15, 2), should I take the final_hidden_states of my trained network and set them as the initial_hidden_states for the test set? I ask because the test set (10, 15, 2) is technically the continuation of (10, 35, 2), but I am not training on the test set.

To be more clear, the 1st question is: do I really need the final_hidden_states for forecasting in this case? The 2nd question is: if I do need them, how can I extract them for all 10 samples, given that the states are flushed internally between batches during training?
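For reference, a minimal sketch of the kind of stateless training I have in mind (the layer size, next-step targets, and optimizer are arbitrary choices; `train` is the array from the snippet above):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed

# stateless LSTM: internal states are reset after every batch (batch_size=1 here)
model = Sequential([
    LSTM(32, return_sequences=True, input_shape=(None, 2)),
    TimeDistributed(Dense(2)),   # predict both features at every timestep
])
model.compile(optimizer='adam', loss='mse')

# one possible target construction: predict each timestep's successor,
# i.e. inputs are timesteps 0..33 and targets are timesteps 1..34
model.fit(train[:, :-1, :], train[:, 1:, :], batch_size=1, epochs=10)
```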

Any help is appreciated. Sample code would be especially welcome (the input and output can be arbitrary or randomly chosen).

I'm currently using Keras in Python.

Thank you.


Solution

  • I think this post should answer your question and also clarify how a Seq2Seq auto-encoder works.
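To make that concrete, here is a hedged sketch of one way to do it in Keras (not necessarily what the linked post does; layer sizes, epochs, and variable names are arbitrary): build the network with the functional API and return_state=True so the LSTM's final hidden and cell states are reachable, read those states for all 10 samples with a single predict over the training window, and then pass them as initial_state when running the held-out window. Since the test window is the direct continuation of the training window, warm-starting the states this way is the natural thing to try; comparing it against a zero-state run on the same held-out data will show whether it actually helps.

```python
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dense, TimeDistributed

# same toy data as in the question: 10 series, 50 timesteps, 2 features
data = np.random.rand(10, 50, 2)
train, test = data[:, :35, :], data[:, 35:, :]

# build with the functional API and return_state=True so the final h and c
# of the LSTM are reachable as tensors (layer size 32 is arbitrary)
inputs = Input(shape=(None, 2))
lstm = LSTM(32, return_sequences=True, return_state=True)
dense = TimeDistributed(Dense(2))

seq_out, state_h, state_c = lstm(inputs)
preds = dense(seq_out)

# training model: trained like the stateless setup described in the question
train_model = Model(inputs, preds)
train_model.compile(optimizer='adam', loss='mse')
train_model.fit(train[:, :-1, :], train[:, 1:, :], batch_size=1, epochs=10)

# state-reading model: shares the trained layers but also outputs the final
# states; one predict over the full training window gives h and c per sample,
# independent of how states were reset between batches during training
state_model = Model(inputs, [preds, state_h, state_c])
_, h, c = state_model.predict(train)               # h and c have shape (10, 32)

# forecasting model: same LSTM/Dense weights, but the LSTM is seeded with an
# externally supplied initial state instead of zeros
h_in = Input(shape=(32,))
c_in = Input(shape=(32,))
seq_out2, _, _ = lstm(inputs, initial_state=[h_in, c_in])
preds2 = dense(seq_out2)
forecast_model = Model([inputs, h_in, c_in], preds2)

# run the held-out window warm-started with the training states; a pure
# multi-step forecast would instead feed the model's own predictions back in
test_pred = forecast_model.predict([test, h, c])   # shape (10, 15, 2)
```

Note that return_state only changes what the layer returns, not how it trains, so the state-reading and forecasting models can share the already-trained layers.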