java, neural-network, recurrent-neural-network, encog

Recurrent Neural Network Text Generator


I'm very new to neural networks, and I'm trying to make an Elman RNN that generates text, using Encog in Java. No matter what I feed the network, it takes a very long time to train, and it always falls into a repeating sequence of characters. Since I'm new to this, I mainly want to make sure I have the concept right; I'm not going to share my actual code because Encog does all the hard work anyway.

The way I'm training the network is to make a data pair for every character in the training data, where the input is the current character and the output is the next character. All of those pairs go into one training set; that's pretty much all I had to write, because Encog handles everything else. To generate text, I feed a character into the network, it returns a character, I feed that one back in, and so on. I assume people usually have an end character so the network can tell you when to stop, but I just stop at 1000 characters to get a decent sample of text. I know Elman networks are supposed to have context nodes, but I think Encog is handling that for me; the context nodes must be doing something, because the same character doesn't always produce the same output.
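To illustrate the idea (this isn't my actual code, just a rough, untested sketch using Encog's ElmanPattern with a toy alphabet, a one-hot encoding, and arbitrary layer sizes and iteration counts), the setup looks something like this:

```java
import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLData;
import org.encog.ml.data.basic.BasicMLData;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;
import org.encog.neural.pattern.ElmanPattern;

public class CharRnnSketch {

    public static void main(String[] args) {
        String text = "hello world hello world ";   // toy training text
        char[] alphabet = "helo wrd".toCharArray(); // distinct characters in the text

        // One training pair per character: input = current char, ideal = next char,
        // both one-hot encoded over the alphabet.
        double[][] input = new double[text.length() - 1][alphabet.length];
        double[][] ideal = new double[text.length() - 1][alphabet.length];
        for (int i = 0; i < text.length() - 1; i++) {
            input[i] = oneHot(text.charAt(i), alphabet);
            ideal[i] = oneHot(text.charAt(i + 1), alphabet);
        }
        BasicMLDataSet trainingSet = new BasicMLDataSet(input, ideal);

        // Elman network: ElmanPattern adds the context layer automatically.
        ElmanPattern pattern = new ElmanPattern();
        pattern.setInputNeurons(alphabet.length);
        pattern.addHiddenLayer(64);
        pattern.setOutputNeurons(alphabet.length);
        pattern.setActivationFunction(new ActivationSigmoid());
        BasicNetwork network = (BasicNetwork) pattern.generate();

        // Train for a fixed number of iterations (placeholder count).
        ResilientPropagation train = new ResilientPropagation(network, trainingSet);
        for (int epoch = 0; epoch < 500; epoch++) {
            train.iteration();
        }
        train.finishTraining();

        // Generation: feed a seed character, take the most likely output,
        // and feed that prediction back in. Stop after a fixed length.
        StringBuilder out = new StringBuilder();
        char current = 'h';
        for (int i = 0; i < 100; i++) {
            MLData result = network.compute(new BasicMLData(oneHot(current, alphabet)));
            current = alphabet[argMax(result.getData())];
            out.append(current);
        }
        System.out.println(out);
    }

    private static double[] oneHot(char c, char[] alphabet) {
        double[] v = new double[alphabet.length];
        for (int i = 0; i < alphabet.length; i++) {
            if (alphabet[i] == c) { v[i] = 1.0; break; }
        }
        return v;
    }

    private static int argMax(double[] v) {
        int best = 0;
        for (int i = 1; i < v.length; i++) {
            if (v[i] > v[best]) { best = i; }
        }
        return best;
    }
}
```

The ElmanPattern is the part that adds the context layer, which is why I think Encog is handling the recurrence for me.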


Solution

  • On a small dataset, RNNs perform poorly because they have to learn everything from scratch. So if you have a small dataset (a training set of fewer than about 10 million characters is usually considered small), gather more data. If you're getting a repeating sequence of characters, that's okay; all you need to do is train longer.

    One more suggestion is to switch from a character-level to a word-level model. You will get much less gibberish in the output.
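To make the word-level suggestion concrete, here is a small hypothetical sketch (plain Java, not tied to Encog) of how the training pairs change: tokenize on whitespace, give each distinct word an index, and build (current word, next word) pairs that you would then one-hot encode over the vocabulary instead of over the character alphabet.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WordLevelPairs {

    public static void main(String[] args) {
        String text = "the quick brown fox jumps over the lazy dog";
        String[] words = text.toLowerCase().split("\\s+");

        // Build a vocabulary: each distinct word gets an index.
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (String w : words) {
            vocab.putIfAbsent(w, vocab.size());
        }

        // One training pair per word: input = current word, ideal = next word.
        // Each index would be one-hot encoded over the vocabulary before being
        // handed to the network, exactly as with characters.
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < words.length - 1; i++) {
            int from = vocab.get(words[i]);
            int to = vocab.get(words[i + 1]);
            pairs.add(new int[]{from, to});
            System.out.println(words[i] + " -> " + words[i + 1]
                    + "  (indices " + from + " -> " + to + ")");
        }
        System.out.println("Vocabulary size: " + vocab.size());
    }
}
```

The trade-off is that the input and output layers grow with the vocabulary size, so this works best once you have enough data to cover the words you care about.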