Tags: python, keras, lstm, deep-residual-networks

Modifying residual LSTM


I found some code for a Residual LSTM here: https://gist.github.com/bzamecnik/8ed16e361a0a6e80e2a4a259222f101e

I have been using an LSTM for time series classification with a 3D input (sample, timestep, features) and a single output. I would like to try the residual model on my data, but what I need is a single output with sigmoid activation. Does anyone understand how to do that? The current model seems to return 10 outputs (the number of features in the input data).
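In case it helps, my data is shaped roughly like this (the array names and sample count below are just placeholders, not my actual data):

import numpy as np

n_samples = 1000                                  # placeholder sample count
X = np.random.rand(n_samples, 32, 10)             # (sample, timestep, features)
y = np.random.randint(0, 2, size=(n_samples, 1))  # one binary label per sequence

The code from the gist, plus the imports I'm using: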

from keras.models import Model
from keras.layers import Input, LSTM, Lambda, Dense, add

def make_residual_lstm_layers(input, rnn_width, rnn_depth, rnn_dropout):
    """
    The intermediate LSTM layers return sequences, while the last returns a single element.
    The input is also a sequence. In order to match the shape of input and output of the LSTM
    to sum them we can do it only for all layers but the last.
    """
    x = input
    for i in range(rnn_depth):
        return_sequences = i < rnn_depth - 1
        x_rnn = LSTM(rnn_width, recurrent_dropout=rnn_dropout, dropout=rnn_dropout, return_sequences=return_sequences)(x)
        if return_sequences:
            # Intermediate layers return sequences, input is also a sequence.
            if i > 0 or input.shape[-1] == rnn_width:
                x = add([x, x_rnn])
            else:
                # Note that the input size and the RNN output size have to match, because of the sum operation.
                # If we want a different rnn_width, we'd have to perform the sum from layer 2 on.
                x = x_rnn
        else:
            # Last layer does not return sequences, just the last element
            # so we select only the last element of the previous output.
            def slice_last(x):
                return x[..., -1, :]

            x = add([Lambda(slice_last)(x), x_rnn])
    return x

input = Input(shape=(32, 10))
output = make_residual_lstm_layers(input, rnn_width=10, rnn_depth=8, rnn_dropout=0.2)
model = Model(inputs=input, outputs=output)
model.summary()
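If I understand the docstring correctly, the shape constraint on the residual sum looks roughly like this (same imports as above, layer widths chosen just for illustration):

seq = Input(shape=(32, 10))
same_width = LSTM(10, return_sequences=True)(seq)   # (None, 32, 10): matches seq, so add() works
summed = add([seq, same_width])
other_width = LSTM(16, return_sequences=True)(seq)  # (None, 32, 16): cannot be summed with seq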

This part, model.compile(loss='binary_crossentropy', optimizer='adam'), I was able to add like so:

model = Model(inputs=input, outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary()

But what I need is something like this:

input = Input(shape=(32, 10))
output = make_residual_lstm_layers(input, rnn_width=10, rnn_depth=8, rnn_dropout=0.2)
newoutput = Dense(1, activation='sigmoid')(output)
model = Model(inputs=input, outputs=newoutput)
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary()

Anyone have an idea how to modify the model to accomplish this?


Solution

  • The main problem is that the feature dimensions don't match (10 != 1), so it's not possible to apply a skip connection in the last block. Here is my proposal, where I substitute the last block with a simple LSTM layer with a single unit and a sigmoid activation:

    def make_residual_lstm_layers(input, rnn_width, rnn_depth, rnn_dropout):

        x = input
        for i in range(rnn_depth):

            return_sequences = i < rnn_depth - 1

            if return_sequences:
                # Intermediate blocks return full sequences so they can be summed
                # with their own input (the skip connection).
                x_rnn = LSTM(rnn_width, recurrent_dropout=rnn_dropout, dropout=rnn_dropout,
                             return_sequences=True)(x)
                if i > 0 or input.shape[-1] == rnn_width:
                    x = add([x, x_rnn])
                else:
                    # First block: the input width differs from rnn_width, so no skip connection.
                    x = x_rnn
            else:
                # Last block: a single-unit LSTM with sigmoid activation, so the model
                # ends with a single output per sequence.
                x = LSTM(1, activation='sigmoid',
                         recurrent_dropout=rnn_dropout, dropout=rnn_dropout,
                         return_sequences=False)(x)
        return x
    
    input = Input(shape=(32, 10))
    output = make_residual_lstm_layers(input, rnn_width=10, rnn_depth=8, rnn_dropout=0.2)
    model = Model(inputs=input, outputs=output)
    model.summary()
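    A quick sanity check (the arrays below are just random placeholders, not real data) that the model now produces one sigmoid output per sequence and trains with binary cross-entropy:

    import numpy as np

    X = np.random.rand(100, 32, 10)             # 100 placeholder sequences of 32 timesteps x 10 features
    y = np.random.randint(0, 2, size=(100, 1))  # placeholder binary labels

    model.compile(loss='binary_crossentropy', optimizer='adam')
    model.fit(X, y, epochs=1, batch_size=16)
    print(model.predict(X[:5]).shape)           # -> (5, 1)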