Tags: python, tensorflow, keras, lstm, recurrent-neural-network

Using LSTM/RNN to predict a sequence of numbers


I am looking to apply an RNN to a fairly simple problem, so as to grasp how it works. I followed this example, which demonstrates how to use an LSTM layer to analyse input, and now I'd like to use one to produce output.

I decided to try to train an RNN to output the successive doublings of an integer given as input, stopping once a value reaches a cap. So for example, using this data:

def doubles(b,cap): 
    seq = [b]
    if b<=0 :
        raise ValueError('Base int must be greater than zero.')
    i = 1
    while seq[-1]<cap:
        seq.append(b*2**i)
        i +=1
    return seq

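maxsize = -1
cap = 100
nums = [2,3,4,6,7,8,9,10,11,12]
mydoubles = []  # separate name, so the doubles() function is not shadowed
for base in nums:
    myseq = doubles(base, cap)
    mydoubles.append(myseq)
    if len(myseq)>=maxsize:
        maxsize = len(myseq) +1  # leave room for at least one -1 marker

# pad every sequence with -1 up to the same length
for s in mydoubles:
    while len(s)<maxsize:
        s.append(-1)
    print(s)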


[2, 4, 8, 16, 32, 64, 128, -1]
[3, 6, 12, 24, 48, 96, 192, -1]
[4, 8, 16, 32, 64, 128, -1, -1]
[6, 12, 24, 48, 96, 192, -1, -1]
[7, 14, 28, 56, 112, -1, -1, -1]
[8, 16, 32, 64, 128, -1, -1, -1]
[9, 18, 36, 72, 144, -1, -1, -1]
[10, 20, 40, 80, 160, -1, -1, -1]
[11, 22, 44, 88, 176, -1, -1, -1]
[12, 24, 48, 96, 192, -1, -1, -1]

I would like to create a Keras model that takes each number in nums as input and outputs the corresponding sequence, using -1 as a 'STOP' indicator, since I want to output only numbers.
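
To make the -1 convention concrete, here is a minimal sketch of how a predicted sequence might be read back. The helper name trim_at_stop and the sample values are hypothetical (they are not real model output); the sketch simply assumes predictions come back as floats that need rounding.

```python
STOP = -1  # padding / stop marker used in the sequences above


def trim_at_stop(pred, stop=STOP):
    """Round a predicted sequence to ints and cut it at the first stop marker."""
    out = []
    for value in pred:
        n = int(round(value))
        if n == stop:
            break
        out.append(n)
    return out


# Made-up, slightly noisy "prediction" for the base 7
print(trim_at_stop([7.1, 13.8, 28.2, 55.9, 112.3, -0.9, -1.1, -1.0]))
# → [7, 14, 28, 56, 112]
```

Rounding before comparing matters here, because a regression output will rarely be exactly -1.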

I have tried creating a model like this:

mymodel = Sequential()

mymodel.add(Input(shape=(1,)))
mymodel.add(Dense(32))
mymodel.add(LSTM(64))

But it raises this error:

ValueError                                Traceback (most recent call last)
<ipython-input-30-24845ffeabd5> in <module>
      3 mymodel.add(Input(shape=(1,)))
      4 mymodel.add(Dense(32))
----> 5 mymodel.add(LSTM(64))
(...)
ValueError: Input 0 of layer lstm_2 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 32)

What additional dimension does it require? Am I using these layers incorrectly, given that I want to output a "timeseries"?


Solution

  • I managed to figure out my problem. I needed a RepeatVector() layer between the Dense and LSTM layers, so that the LSTM receives the 3D input it expects. My model structure now looks like this:

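    # imports assume the tf.keras API, as the question's tags suggest
    import numpy
    from tensorflow.keras import Input, Sequential
    from tensorflow.keras.layers import LSTM, Dense, RepeatVector

    mymodel = Sequential()

    mymodel.add(Input(shape=(1,)))
    mymodel.add(Dense(8))
    mymodel.add(RepeatVector(8))  # (batch, 8) -> (batch, 8, 8), the 3D input LSTM expects
    mymodel.add(LSTM(8, activation="relu"))  # returns only the final state: (batch, 8)

    mymodel.compile(loss="MSE", optimizer="adam")
    history = mymodel.fit(numpy.array(nums), numpy.array(mydoubles), epochs=4000, verbose=0)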

    I am still experimenting with activation functions and optimizers, as I am running into vanishing-gradient issues, but that is a problem for another post.
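
For context on the original error: an LSTM layer expects 3D input of shape (batch, timesteps, features), while a Dense layer applied to a flat input produces 2D output of shape (batch, units). RepeatVector(n) bridges the two by copying the vector n times along a new timestep axis. The sketch below only traces the shapes. It is a variant under my own assumptions, not the model above: it uses return_sequences=True and a TimeDistributed(Dense(1)) head so that the network emits one number per timestep, and it assumes the tf.keras API.

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

timesteps = 8  # length of the padded sequences in the question

model = Sequential([
    Input(shape=(1,)),               # (batch, 1): one base number per sample
    Dense(8),                        # (batch, 8): still 2D, an LSTM would reject this
    RepeatVector(timesteps),         # (batch, 8, 8): the vector copied once per timestep
    LSTM(8, return_sequences=True),  # (batch, 8, 8): one hidden state per timestep
    TimeDistributed(Dense(1)),       # (batch, 8, 1): one number per timestep
])

# Print the output shape of every layer to see where the third axis appears
for layer in model.layers:
    print(layer.name, layer.output.shape)
```

To train a variant like this, the targets would need a matching trailing axis, for example numpy.array(mydoubles)[..., None] with shape (10, 8, 1). Whether it trains better than the single-LSTM version is untested here.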