tensorflow · keras · lstm · autoencoder · seq2seq

How to use a TimeDistributed layer for predicting sequences of dynamic length? (Python 3)


So I am trying to build an LSTM-based autoencoder, which I want to use for time series data. The data is split into sequences of different lengths, so the input to the model has shape [None, None, n_features], where the first None stands for the number of samples and the second for the time_steps of each sequence. The sequences are processed by an LSTM with return_sequences=False, the coded dimension is then expanded back into a sequence by RepeatVector and run through an LSTM again. In the end I would like to use the TimeDistributed layer, but how do I tell Keras that the time_steps dimension is dynamic? See my code:

from keras import backend as K
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from keras.models import Model
input_ae = Input(shape=(None, 2))  # shape: time_steps, n_features
LSTM1 = LSTM(units=128, return_sequences=False)(input_ae)
code = RepeatVector(n=K.shape(input_ae)[1])(LSTM1) # bottleneck layer
LSTM2 = LSTM(units=128, return_sequences=True)(code)
output = TimeDistributed(Dense(units=2))(LSTM2) # ???????  HOW TO ????

# no problem here so far: 
model = Model(input_ae, outputs=output) 
model.compile(optimizer='adam', loss='mse')

Solution

  • this function seems to do the trick — RepeatVector needs a static integer for n, so instead the encoder output is repeated inside a Lambda layer using the runtime shape of the input:

    def repeat(x_inp):
        # x: encoder output (batch, units); inp: original input (batch, time_steps, features)
        x, inp = x_inp
        x = tf.expand_dims(x, 1)                      # -> (batch, 1, units)
        x = tf.repeat(x, [tf.shape(inp)[1]], axis=1)  # -> (batch, time_steps, units)

        return x
    

    Example:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.layers import Input, LSTM, Lambda, TimeDistributed, Dense
    from tensorflow.keras.models import Model

    input_ae = Input(shape=(None, 2))  # dynamic time_steps, 2 features
    LSTM1 = LSTM(units=128, return_sequences=False)(input_ae)
    code = Lambda(repeat)([LSTM1, input_ae])  # bottleneck, repeated to the input's length
    LSTM2 = LSTM(units=128, return_sequences=True)(code)
    output = TimeDistributed(Dense(units=2))(LSTM2)

    model = Model(input_ae, output)
    model.compile(optimizer='adam', loss='mse')

    X = np.random.uniform(0, 1, (100, 30, 2))  # 100 samples, 30 time steps, 2 features
    model.fit(X, X, epochs=5)
    

    I'm using tf.keras with TF 2.2
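
    For intuition, the shape logic of repeat can be checked with plain NumPy — np.expand_dims and np.repeat mirror the TensorFlow calls, and the concrete shapes below are just illustrative:

    ```python
    import numpy as np

    # Pretend encoder output for a batch of 4 sequences: (batch, units)
    x = np.random.uniform(0, 1, (4, 128))
    # A batch whose dynamic length happens to be 30: (batch, time_steps, features)
    inp = np.zeros((4, 30, 2))

    # Same logic as the Lambda above: insert a time axis, then tile along it
    code = np.repeat(np.expand_dims(x, 1), inp.shape[1], axis=1)
    print(code.shape)  # (4, 30, 128) — every time step holds a copy of x
    ```

    Note that within a single batch all sequences still share one length; the dynamic dimension varies across batches (or you pad shorter sequences and use masking).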