Tags: tensorflow, backpropagation, lstm, recurrent-neural-network

Influence of number of steps back when using truncated backpropagation through time


I'm currently developing a model for time series prediction using LSTM cells with TensorFlow. My model is similar to the ptb_word_lm example. It works, but I'm not sure how to interpret the "number of steps back" parameter used for truncated backpropagation through time (the parameter is called num_steps in the example).

As far as I understand, the model parameters are updated after every num_steps steps. But does that also mean that the model cannot recognize dependencies that are farther apart than num_steps? I think it should, because the internal state should capture them. But then, what effect does a large/small num_steps value have?
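For context, my training loop follows the usual truncated-BPTT pattern: after each window of num_steps inputs I keep the final LSTM state and feed it as the initial state of the next window. A simplified sketch (made-up sizes, TF 1.x API as in ptb_word_lm, not my actual model):

    import numpy as np
    import tensorflow as tf  # TF 1.x API, as used by ptb_word_lm

    batch_size, num_steps, input_dim, hidden_size = 4, 5, 8, 16

    # One truncated window of num_steps inputs, plus the LSTM state
    # carried over from the previous window.
    inputs = tf.placeholder(tf.float32, [batch_size, num_steps, input_dim])
    c_in = tf.placeholder(tf.float32, [batch_size, hidden_size])
    h_in = tf.placeholder(tf.float32, [batch_size, hidden_size])

    cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
    initial_state = tf.nn.rnn_cell.LSTMStateTuple(c_in, h_in)
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)

    # Dummy loss: gradients only flow back through the current num_steps window.
    loss = tf.reduce_mean(tf.square(outputs))
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        c = np.zeros((batch_size, hidden_size), dtype=np.float32)
        h = np.zeros((batch_size, hidden_size), dtype=np.float32)
        for window in range(10):  # consecutive windows of one long sequence
            x = np.random.randn(batch_size, num_steps, input_dim).astype(np.float32)
            _, (c, h) = sess.run([train_op, final_state],
                                 feed_dict={inputs: x, c_in: c, h_in: h})
            # c and h are fed into the next window, so information can persist
            # beyond num_steps even though gradients are truncated there.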


Solution

  • The num_steps in the ptb_word_lm example is the sequence length, i.e. the number of words the model is unrolled over when predicting the next word.

    For example, consider the sentence:

    "Widows and orphans occur when the first line of a paragraph is the last line in a column or page, or when the last line of a paragraph is the first line of a new column or page."

    If you set num_steps = 5,

    then for one window you have

    input = "Widows and orphans occur when"

    output = "and orphans occur when the"

    i.e., given the words ("Widows", "and", "orphans", "occur", "when"), you are trying to predict the occurrence of the next word ("the").

    So num_steps plays an important role in how much context (i.e. how many preceding words) is available for predicting the probability of the next word (a small sketch follows below).

    Hope this is helpful.
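To make the windowing concrete, here is a minimal sketch in plain Python (hypothetical helper, not the actual PTB reader code) that produces exactly those shifted input/output windows:

    sentence = ("Widows and orphans occur when the first line of a paragraph "
                "is the last line in a column or page").split()

    def make_windows(words, num_steps):
        # Each target window is the input window shifted by one word,
        # so at every step the model predicts the following word.
        for i in range(0, len(words) - num_steps, num_steps):
            yield words[i:i + num_steps], words[i + 1:i + num_steps + 1]

    for x, y in make_windows(sentence, num_steps=5):
        print("input :", x)
        print("output:", y)

    # First window printed:
    # input : ['Widows', 'and', 'orphans', 'occur', 'when']
    # output: ['and', 'orphans', 'occur', 'when', 'the']

A larger num_steps gives each window a longer context that gradients can flow through directly; a smaller num_steps makes each update cheaper but cuts the gradient off after fewer words.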