Keras LSTM training for text generation

I am working on a character-level text generator using Keras. After going through the examples and tutorials, there is something I still do not understand.

The training data (X) is split into semi-redundant (overlapping) sequences of length maxlen, with y being the character immediately following each sequence.
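
To make the setup concrete, here is a rough sketch of the preprocessing I mean. The function name `make_sequences` and the `step` parameter are my own placeholders, and the one-hot layout is an assumption about how the tutorials encode the data rather than a copy of any of them.

```python
import numpy as np

def make_sequences(text, maxlen=5, step=2):
    """Cut text into overlapping windows; each target is the next character."""
    chars = sorted(set(text))
    char_to_idx = {c: i for i, c in enumerate(chars)}

    windows, next_chars = [], []
    for start in range(0, len(text) - maxlen, step):
        windows.append(text[start:start + maxlen])
        next_chars.append(text[start + maxlen])

    # One-hot encode: X is (samples, maxlen, vocab), y is (samples, vocab).
    X = np.zeros((len(windows), maxlen, len(chars)), dtype=np.float32)
    y = np.zeros((len(windows), len(chars)), dtype=np.float32)
    for i, window in enumerate(windows):
        for t, ch in enumerate(window):
            X[i, t, char_to_idx[ch]] = 1.0
        y[i, char_to_idx[next_chars[i]]] = 1.0
    return windows, next_chars, X, y

windows, next_chars, X, y = make_sequences("hello world", maxlen=5, step=2)
print(windows)      # ['hello', 'llo w', 'o wor']
print(next_chars)   # [' ', 'o', 'l']
print(X.shape, y.shape)  # (3, 5, 8) (3, 8)
```

So each training sample is a whole window of maxlen characters, and the only target is the single character that follows it.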

I understand that this is for efficiency, since it means training can only learn dependencies that span at most maxlen characters.

I am struggling to understand why it is done in sequences, though. I thought LSTMs/RNNs were trained by inputting characters one at a time and comparing the predicted next character to the actual next character. This seems very different from inputting them, say, maxlen=50 characters at a time and comparing the output for each length-50 sequence to the single character that follows it.

Does Keras actually break up the training sequences and input them character by character "under the hood"?

If not, why not?


  • Since you are generating sequences, I'm assuming that you set the flag stateful=True in your recurrent layers. Without this option, the different sequences are treated as independent of one another, which I don't think is what you want. If the flag is set to True, then both approaches are equivalent, and dividing the text into sequences is done only for performance and simplicity.
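
To illustrate the point about carried state, here is a minimal NumPy sketch. It is not Keras code, and it uses a plain tanh RNN cell instead of an LSTM to keep it short; the weights `W_x`, `W_h` and the helper `run_sequence` are hypothetical names made up for the example. It only covers the forward pass: a sequence is consumed one time step at a time, and feeding a long text as consecutive chunks while carrying the state over gives the same final state as feeding it in one go.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, hidden, maxlen = 4, 3, 6

# Hypothetical weights for a plain tanh RNN cell (simpler than an LSTM).
W_x = rng.normal(size=(vocab, hidden))
W_h = rng.normal(size=(hidden, hidden))

def run_sequence(x_seq, h):
    """Feed x_seq one time step at a time, starting from state h."""
    for x_t in x_seq:
        h = np.tanh(x_t @ W_x + h @ W_h)
    return h

# One one-hot encoded text of 2 * maxlen characters.
ids = rng.integers(0, vocab, size=2 * maxlen)
x = np.eye(vocab)[ids]
h0 = np.zeros(hidden)

# Whole text in one pass.
h_full = run_sequence(x, h0)

# Two chunks of maxlen characters with the state carried over
# (the idea behind stateful=True).
h_mid = run_sequence(x[:maxlen], h0)
h_carried = run_sequence(x[maxlen:], h_mid)

# Second chunk alone, starting from a fresh zero state
# (the idea behind the default, non-stateful behaviour).
h_reset = run_sequence(x[maxlen:], h0)

same_when_carried = np.allclose(h_full, h_carried)
print(same_when_carried)  # True
```

With the state reset between chunks, `h_reset` will in general differ from `h_full`, because everything before the chunk boundary is forgotten.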