Tags: tensorflow, backpropagation, lstm, recurrent-neural-network

Influence of number of steps back when using truncated backpropagation through time


I'm currently developing a model for time series prediction using LSTM cells with TensorFlow. My model is similar to the ptb_word_lm example. It works, but I'm not sure how to interpret the "number of steps back" parameter used for truncated backpropagation through time (the parameter is called num_steps in the example).

As far as I understand, the model parameters are updated after every num_steps steps. But does that also mean that the model cannot recognize dependencies that are farther apart than num_steps? I think it should, because the internal state should capture them. But then, what effect does a large/small num_steps value have?
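For context, my training loop follows the usual truncated-BPTT pattern: after each window of num_steps inputs I keep the final LSTM state and feed it as the initial state of the next window. A simplified sketch (made-up sizes, TF 1.x API as in ptb_word_lm, not my actual model):

    import numpy as np
    import tensorflow as tf  # TF 1.x API, as used by ptb_word_lm

    batch_size, num_steps, input_dim, hidden_size = 4, 5, 8, 16

    # One truncated window of num_steps inputs, plus the LSTM state
    # carried over from the previous window.
    inputs = tf.placeholder(tf.float32, [batch_size, num_steps, input_dim])
    c_in = tf.placeholder(tf.float32, [batch_size, hidden_size])
    h_in = tf.placeholder(tf.float32, [batch_size, hidden_size])

    cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size)
    initial_state = tf.nn.rnn_cell.LSTMStateTuple(c_in, h_in)
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)

    # Dummy loss: gradients only flow back through the current num_steps window.
    loss = tf.reduce_mean(tf.square(outputs))
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        c = np.zeros((batch_size, hidden_size), dtype=np.float32)
        h = np.zeros((batch_size, hidden_size), dtype=np.float32)
        for window in range(10):  # consecutive windows of one long sequence
            x = np.random.randn(batch_size, num_steps, input_dim).astype(np.float32)
            _, (c, h) = sess.run([train_op, final_state],
                                 feed_dict={inputs: x, c_in: c, h_in: h})
            # c and h are fed into the next window, so information can persist
            # beyond num_steps even though gradients are truncated there.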


Solution

  • The num_steps in the ptb_word_lm example is the sequence length, i.e. the number of words the model is unrolled over when predicting the next word.

    For example, consider the sentence:

    "Widows and orphans occur when the first line of a paragraph is the last line in a column or page, or when the last line of a paragraph is the first line of a new column or page."

    If you set num_steps = 5,

    then for one window you have

    input = "Widows and orphans occur when"

    output = "and orphans occur when the"

    i.e., given the words ("Widows", "and", "orphans", "occur", "when"), you are trying to predict the occurrence of the next word ("the").

    So num_steps plays an important role in how much context (i.e. how many preceding words) is available for predicting the probability of the next word (a small sketch follows below).

    Hope this is helpful.
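To make the windowing concrete, here is a minimal sketch in plain Python (hypothetical helper, not the actual PTB reader code) that produces exactly those shifted input/output windows:

    sentence = ("Widows and orphans occur when the first line of a paragraph "
                "is the last line in a column or page").split()

    def make_windows(words, num_steps):
        # Each target window is the input window shifted by one word,
        # so at every step the model predicts the following word.
        for i in range(0, len(words) - num_steps, num_steps):
            yield words[i:i + num_steps], words[i + 1:i + num_steps + 1]

    for x, y in make_windows(sentence, num_steps=5):
        print("input :", x)
        print("output:", y)

    # First window printed:
    # input : ['Widows', 'and', 'orphans', 'occur', 'when']
    # output: ['and', 'orphans', 'occur', 'when', 'the']

A larger num_steps gives each window a longer context that gradients can flow through directly; a smaller num_steps makes each update cheaper but cuts the gradient off after fewer words.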