Tags: keras, lstm, encoder-decoder

Encoder-Decoder for Trajectory Prediction


I need to use an encoder-decoder structure to predict 2D trajectories. Since almost all available tutorials relate to NLP (with sparse vectors), I am not sure how to adapt these solutions to continuous data.

In addition to my ignorance of sequence-to-sequence models, the embedding process for words confused me further. I have a dataset of 3,000,000 samples, each consisting of 125 observations of x-y coordinates in (-1, 1), which means each sample has shape (125, 2). I thought I could treat this as 125 words with already-embedded 2-dimensional vectors, but the encoder and the decoder in this Keras tutorial expect 3D arrays of shape (num_pairs, max_english_sentence_length, num_english_characters).
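For reference, a minimal sketch of the shapes involved (the array names and the small sample count are made up for illustration, not my real data):

```python
import numpy as np

# Stand-in for the real trajectory data: each sample has
# 125 (x, y) observations, with coordinates in (-1, 1).
num_samples = 1000  # small subset for illustration
trajectories = np.random.uniform(-1, 1, size=(num_samples, 125, 2))

# Stacked together, the samples already form the 3D array a Keras
# LSTM expects: (batch, timesteps, features).
print(trajectories.shape)  # (1000, 125, 2)
```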

I wonder whether I need to train each (125, 2) sample separately with this model, the way Google's search bar handles a single typed word.

As far as I understand, an encoder is a many-to-one model and a decoder is a one-to-many model. I need to get a memory state c and a hidden state h as vectors(?). Then I should use those vectors as input to the decoder and extract as many (x, y) predictions as I specify for the decoder output.
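To check my understanding of the states, I tried a small snippet (the unit count is arbitrary); with return_state=True, a Keras LSTM does return the hidden and cell states as vectors:

```python
import numpy as np
from tensorflow.keras import layers

# One dummy sample shaped like my data: (batch, timesteps, features)
x = np.random.uniform(-1, 1, size=(1, 125, 2)).astype("float32")

lstm = layers.LSTM(32, return_state=True)
output, state_h, state_c = lstm(x)

# state_h and state_c are vectors of length `units` per sample
print(state_h.shape, state_c.shape)  # (1, 32) (1, 32)
```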

I'd be so thankful if someone could give an example of an encoder-decoder LSTM architecture for the shape of my dataset, especially the dimensions required for the encoder-decoder inputs and outputs, particularly as a Keras model if possible.


Solution

  • I assume you want to forecast 50 time steps from the 125 previous ones (as an example). Here is the most basic encoder-decoder structure for time series, but it can be improved (with Luong attention, for instance).

    from tensorflow.keras import layers,models
    
    input_timesteps=125
    input_features=2
    output_timesteps=50
    output_features=2
    units=100
    
    #Input
    encoder_inputs = layers.Input(shape=(input_timesteps,input_features))
    
    #Encoder
    encoder = layers.LSTM(units, return_state=True, return_sequences=False)
    encoder_outputs, state_h, state_c = encoder(encoder_inputs) # because return_sequences=False => encoder_outputs=state_h
    
    #Decoder
    decoder = layers.RepeatVector(output_timesteps)(state_h)
    decoder_lstm = layers.LSTM(units, return_sequences=True, return_state=False)
    decoder = decoder_lstm(decoder, initial_state=[state_h, state_c])
    
    
    #Output
    out = layers.TimeDistributed(layers.Dense(output_features))(decoder)
    
    model = models.Model(encoder_inputs, out)
    

    So the core idea here is:

    1. Encode the time series into two states: state_h and state_c. Check this to understand how LSTM cells work.
    2. Repeat state_h for the number of time steps you want to forecast.
    3. Decode using an LSTM whose initial states are the ones computed by the encoder.
    4. Use a Dense layer to produce the number of features needed at each time step.
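To make those steps concrete, here is a sketch of compiling and training the model above on dummy data (the random arrays, optimizer, loss, and epoch count are placeholder choices, not part of the original answer):

```python
import numpy as np
from tensorflow.keras import layers, models

input_timesteps, input_features = 125, 2
output_timesteps, output_features = 50, 2
units = 100

# Same architecture as above
encoder_inputs = layers.Input(shape=(input_timesteps, input_features))
encoder_outputs, state_h, state_c = layers.LSTM(
    units, return_state=True, return_sequences=False)(encoder_inputs)
decoder = layers.RepeatVector(output_timesteps)(state_h)
decoder = layers.LSTM(units, return_sequences=True)(
    decoder, initial_state=[state_h, state_c])
out = layers.TimeDistributed(layers.Dense(output_features))(decoder)
model = models.Model(encoder_inputs, out)

# Dummy trajectories in (-1, 1); replace with the real dataset
x = np.random.uniform(-1, 1, size=(64, input_timesteps, input_features))
y = np.random.uniform(-1, 1, size=(64, output_timesteps, output_features))

# MSE is a natural loss for continuous (x, y) targets
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=1, batch_size=16, verbose=0)

preds = model.predict(x[:4])
print(preds.shape)  # (4, 50, 2)
```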

    I advise you to test this architecture and visualize it with model.summary() and tf.keras.utils.plot_model(model, show_shapes=True). They give you good representations like, for the summary:

    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_5 (InputLayer)            [(None, 125, 2)]     0                                            
    __________________________________________________________________________________________________
    lstm_8 (LSTM)                   [(None, 100), (None, 41200       input_5[0][0]                    
    __________________________________________________________________________________________________
    repeat_vector_4 (RepeatVector)  (None, 50, 100)      0           lstm_8[0][1]                     
    __________________________________________________________________________________________________
    lstm_9 (LSTM)                   (None, 50, 100)      80400       repeat_vector_4[0][0]            
                                                                     lstm_8[0][1]                     
                                                                     lstm_8[0][2]                     
    __________________________________________________________________________________________________
    time_distributed_4 (TimeDistrib (None, 50, 2)        202         lstm_9[0][0]                     
    ==================================================================================================
    Total params: 121,802
    Trainable params: 121,802
    Non-trainable params: 0
    __________________________________________________________________________________________________
    

    The plotted model shows the same structure graphically (image omitted here).