I'm currently developing a model for time series prediction using LSTM cells with TensorFlow. My model is similar to the ptb_word_lm example. It works, but I'm not sure how to interpret the number-of-steps-back parameter for truncated backpropagation through time (the parameter is called num_steps in the example).
As far as I understand, the model parameters are updated after every num_steps steps. But does that also mean the model cannot recognize dependencies that reach farther back than num_steps steps? I think it should be able to, because the internal state should capture them. But then what effect does a large/small num_steps value have?
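For concreteness, here is a minimal sketch of the training pattern I mean, written against a TF 1.x-style API like the one ptb_word_lm uses (the sine-wave data, layer sizes, and learning rate are just placeholders I made up):

    import numpy as np
    import tensorflow as tf

    num_steps, batch_size, hidden_size = 5, 1, 32

    # one window of num_steps inputs plus the targets shifted by one step
    inputs  = tf.placeholder(tf.float32, [batch_size, num_steps, 1])
    targets = tf.placeholder(tf.float32, [batch_size, num_steps, 1])

    cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
    initial_state = cell.zero_state(batch_size, tf.float32)

    # the graph is unrolled for exactly num_steps steps, so gradients
    # cannot flow back past the start of the window (the truncation)
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
                                             initial_state=initial_state)
    predictions = tf.layers.dense(outputs, 1)
    loss = tf.reduce_mean(tf.square(predictions - targets))
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    series = np.sin(np.linspace(0, 100, 1000)).astype(np.float32)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        state = sess.run(initial_state)
        for start in range(0, len(series) - num_steps - 1, num_steps):
            x = series[start:start + num_steps].reshape(batch_size, num_steps, 1)
            y = series[start + 1:start + num_steps + 1].reshape(batch_size, num_steps, 1)
            # feed the final state of the previous window as the initial
            # state of the next one: the numeric state flows across window
            # boundaries even though the gradient is cut off at the edge
            state, _ = sess.run([final_state, train_op],
                                {inputs: x, targets: y, initial_state: state})

My understanding is that the state passed along via initial_state is what lets the model keep information from more than num_steps steps back, while the gradient only ever sees num_steps steps. Is that right?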
The num_steps in the ptb_word_lm example is the sequence length, i.e. the number of words the network is given for predicting the next word.
For example, take the sentence:
"Widows and orphans occur when the first line of a paragraph is the last line in a column or page, or when the last line of a paragraph is the first line of a new column or page."
If you set num_steps = 5, then you have
input = "Widows and orphans occur when"
output = "and orphans occur when the"
i.e. given the words ("Widows", "and", "orphans", "occur", "when"), you are trying to predict the occurrence of the next word ("the").
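To make the windowing concrete, here is a small sketch (make_windows is a hypothetical helper, not something from the example) of how num_steps slices a word sequence into such input/output pairs:

    # hypothetical helper: slice a token list into (input, target) windows
    def make_windows(tokens, num_steps):
        for i in range(0, len(tokens) - num_steps, num_steps):
            x = tokens[i:i + num_steps]          # num_steps input words
            y = tokens[i + 1:i + num_steps + 1]  # the same words shifted by one
            yield x, y

    sentence = ("Widows and orphans occur when the first line of a paragraph "
                "is the last line in a column or page").split()

    x, y = next(make_windows(sentence, num_steps=5))
    print(x)  # ['Widows', 'and', 'orphans', 'occur', 'when']
    print(y)  # ['and', 'orphans', 'occur', 'when', 'the']

Each window gives the network num_steps consecutive words of context, and at every position the target is simply the next word.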
So num_steps actually plays an important role in remembering the larger context (i.e. the given words) for predicting the probability of the next word.
Hope this is helpful.