Tags: python-3.x, neural-network, lstm, tflearn

4 Questions about an LSTM network for sentence generation


Warning: I am a deep learning noob.

I am training my two-layer LSTM model on a dataset of 231,657 jokes and want to know 4 things:

  1. I currently train it on 50 characters per sentence. If I want it to generate new jokes, do I need to input 50 characters first, or can I randomly pick a single character to start the sentence/joke?

  2. Is it not useful to train it on only 50 characters at a time, with about 1.8 million sentences in total (the input shape is [10800001, 50, 1]), or is that good enough?

  3. I use a class in which I initialize my model so I can call it. Unfortunately, if I want to generate a long sentence or multiple sentences, I have to call my predict statement more than once. The problem is that my predict statement initializes the model first and then predicts the value, so I have to use tf.reset_default_graph(), and after a while each call takes longer and longer. What should I do to prevent this problem? Should I maybe initialize the model in the main script, or something like that?

  4. How do I solve the problem of the growing text? I currently take the shape of the input and use it to initialize my model inside the class, but is that a good idea?

Solution

    1. You need to start by feeding in a seed sequence of 50 characters; after that you predict one character at a time and slide the 50-character window forward (see the sketch after this list).
    2. I'd suggest increasing the sequence length.
    3. I don't fully understand your setup, but I suggest structuring your model properly. Read this for more: https://danijar.com/structuring-your-tensorflow-models/
    4. Again, I suggest reading the link above.
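
    For point 1, a minimal generation loop could look like the sketch below. It assumes you already have a trained model with a predict method plus char_to_idx/idx_to_char mappings built from your training data; those names, the temperature sampling, and the index normalization (based on the [10800001, 50, 1] shape you mention) are my assumptions, not something from your code.

```python
import numpy as np

SEQ_LEN = 50  # same window length the network was trained on

def generate_text(model, seed, char_to_idx, idx_to_char,
                  n_chars=200, temperature=1.0):
    """Generate n_chars characters starting from a 50-char seed string.

    Assumes model.predict takes a batch shaped [1, SEQ_LEN, 1] and
    returns next-character probabilities of shape [1, vocab_size].
    """
    assert len(seed) == SEQ_LEN, "seed must match the training window length"
    generated = seed
    window = [char_to_idx[c] for c in seed]

    for _ in range(n_chars):
        # Normalize indices the same way as during training (assumed: divide by vocab size).
        x = np.array(window, dtype=np.float32).reshape(1, SEQ_LEN, 1) / len(char_to_idx)
        probs = np.asarray(model.predict(x)[0]).astype(np.float64)

        # Temperature sampling instead of argmax, to avoid repetitive output.
        probs = np.log(probs + 1e-8) / temperature
        probs = np.exp(probs) / np.sum(np.exp(probs))
        next_idx = np.random.choice(len(probs), p=probs)

        generated += idx_to_char[next_idx]
        window = window[1:] + [next_idx]  # slide the 50-char window forward

    return generated
```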

    It's not always necessary to wrap your model in a class. You can build the model once in a procedural way, train it, and then save it with tf.train.Saver().
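
    For points 3 and 4, the key change is to build the graph and open a session once, then run every prediction against that same session, so tf.reset_default_graph() is never needed inside the generation loop. Below is a rough TF 1.x illustration; the layer sizes, vocabulary size, checkpoint path, and dummy seed are placeholders I made up, not values from your model.

```python
import numpy as np
import tensorflow as tf  # TF 1.x style, matching the tf.reset_default_graph() usage above

SEQ_LEN, VOCAB = 50, 100  # VOCAB is a placeholder; use your real character count

# --- build the graph exactly once ---
inputs = tf.placeholder(tf.float32, [None, SEQ_LEN, 1], name="inputs")
cells = tf.nn.rnn_cell.MultiRNNCell([tf.nn.rnn_cell.LSTMCell(256) for _ in range(2)])
outputs, _ = tf.nn.dynamic_rnn(cells, inputs, dtype=tf.float32)
logits = tf.layers.dense(outputs[:, -1, :], VOCAB)
probs = tf.nn.softmax(logits, name="probs")

saver = tf.train.Saver()

with tf.Session() as sess:
    # Either train here and call saver.save(sess, "ckpt/model"),
    # or restore previously trained weights (path is an assumption):
    saver.restore(sess, "ckpt/model")

    window = np.zeros([1, SEQ_LEN, 1], dtype=np.float32)  # replace with your real 50-char seed
    for _ in range(500):
        p = sess.run(probs, feed_dict={inputs: window})[0]
        next_idx = np.random.choice(VOCAB, p=p)
        # Slide the window: drop the oldest character, append the new one.
        window = np.roll(window, -1, axis=1)
        window[0, -1, 0] = next_idx / VOCAB
    # No tf.reset_default_graph() needed: the graph and session live for the whole loop.
```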