Tags: python, deep-learning, lstm

How to add one point as a feature in an encoder-decoder time series model?


I have been performing seq2seq time-series prediction using an encoder-decoder LSTM architecture. The input to the model has 2 features, which are essentially two arrays: one holds the dependent variable (y-values) and the other the independent variable (x-values). The shape of the input array is:

input_shape: (57, 20, 2)

where, for example, the x- and y-values of one time series together have the shape (1, 20, 2), and their positions in the 3D array are:

x = input_shape[:, :, 0]
y = input_shape[:, :, 1]
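As a sanity check, here is a minimal NumPy sketch showing how the two features sit in the last axis of a (57, 20, 2) array (the array below is a dummy stand-in for the real data):

```python
import numpy as np

# Dummy stand-in for the real input: 57 series, 20 timesteps, 2 features.
data = np.zeros((57, 20, 2))
data[:, :, 0] = 1.0  # x-values live in feature slot 0
data[:, :, 1] = 2.0  # y-values live in feature slot 1

x = data[:, :, 0]  # shape (57, 20)
y = data[:, :, 1]  # shape (57, 20)
print(x.shape, y.shape)  # (57, 20) (57, 20)
```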

I am now faced with the challenge of feeding a single point (an x-y timestep, so to speak) to the model as an additional feature. Is there any way to do so?

EDIT: I have added the model I'm using, as requested in the comments. Note that the input size mentioned here is kept small for simplicity; the actual input I am using is considerably larger.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Masking, Bidirectional, LSTM,
                                     RepeatVector, TimeDistributed, Dense)
from tensorflow.keras import optimizers

model = Sequential()
model.add(Masking(mask_value=0, input_shape=(input_shape.shape[1], 2)))

model.add(Bidirectional(LSTM(128, dropout=0, return_sequences=True, activation='tanh')))
model.add(Bidirectional(LSTM(128, dropout=0, return_sequences=False)))

model.add(RepeatVector(targets.shape[1]))

model.add(Bidirectional(LSTM(128, dropout=0, return_sequences=True, activation='tanh')))
model.add(Bidirectional(LSTM(128, dropout=0, return_sequences=True)))

model.add(TimeDistributed(Dense(64, activation='relu')))
model.add(TimeDistributed(Dense(1, activation='linear')))
model.build()
model.compile(optimizer=optimizers.Adam(0.00001), loss='MAE')

Solution

  • I would give your model two inputs: the first is your normal time series with shape (batch, 20, 2), and the second is your special time point with shape (batch, 2). Then define the following architecture, which repeats your special point 20 times to get (batch, 20, 2) and concatenates that with your normal input. (Note: I defined target_shape_1 to make sure it compiles on my end, but you can replace it with targets.shape[1].)

    import tensorflow as tf
    from tensorflow.keras.layers import (Input, Masking, Bidirectional, LSTM,
                                         RepeatVector, TimeDistributed, Dense)

    input_shape_1 = 20
    target_shape_1 = 3

    normal_input = Input(shape=(20, 2), name='normal_inputs') #your normal time series (None,20,2) (batch,time,feats)

    key_time_point = Input(shape=(2,), name='key_time_point') #your single special point (None,2) (batch,feats)
    
    key_time_repeater = RepeatVector(20,name='key_time_repeater') #repeat your special point 20 times 
    key_time_repeater_out = key_time_repeater(key_time_point) #turning your (None,2) into (None,20,2)
    
    
    initial_mask = Masking(mask_value=0, input_shape = (20, 4))
    
    masked_out = initial_mask( 
        #concat your normal input (None,20,2) and repeated input (None,20,2) 
        #into (None, 20,4) and feed to nn
        tf.concat([normal_input, key_time_repeater_out], axis=-1)
    )
    
    encoder_1 = Bidirectional(LSTM(128, dropout=0, return_sequences=True, activation='tanh'))
    encoder_2 = Bidirectional(LSTM(128, dropout=0, return_sequences=False))
    encoder_repeat = RepeatVector(target_shape_1)
    encoder_out = encoder_repeat(encoder_2(encoder_1(masked_out)))
    
    decoder_1 = Bidirectional(LSTM(128, dropout=0, return_sequences=True, activation='tanh'))
    decoder_2 = Bidirectional(LSTM(128, dropout=0, return_sequences=True))
    decoder_dense = TimeDistributed(Dense(64, activation='relu'))
    decoder_out = decoder_dense(decoder_2(decoder_1(encoder_out)))
    
    final_output = TimeDistributed(Dense(1, activation='linear'))(decoder_out)
    
    model = tf.keras.models.Model(inputs=[normal_input, key_time_point], outputs=final_output)
    model.compile(optimizer=tf.keras.optimizers.Adam(0.00001), loss='MAE')
    

    A summary() of the model looks like this:

    Model: "model_1"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    key_time_point (InputLayer)     [(None, 2)]          0                                            
    __________________________________________________________________________________________________
    normal_inputs (InputLayer)      [(None, 20, 2)]      0                                            
    __________________________________________________________________________________________________
    key_time_repeater (RepeatVector (None, 20, 2)        0           key_time_point[0][0]             
    __________________________________________________________________________________________________
    tf_op_layer_concat_3 (TensorFlo [(None, 20, 4)]      0           normal_inputs[0][0]              
                                                                     key_time_repeater[0][0]          
    __________________________________________________________________________________________________
    masking_4 (Masking)             (None, 20, 4)        0           tf_op_layer_concat_3[0][0]       
    __________________________________________________________________________________________________
    bidirectional_12 (Bidirectional (None, 20, 256)      136192      masking_4[0][0]                  
    __________________________________________________________________________________________________
    bidirectional_13 (Bidirectional (None, 256)          394240      bidirectional_12[0][0]           
    __________________________________________________________________________________________________
    repeat_vector_11 (RepeatVector) (None, 3, 256)       0           bidirectional_13[0][0]           
    __________________________________________________________________________________________________
    bidirectional_14 (Bidirectional (None, 3, 256)       394240      repeat_vector_11[0][0]           
    __________________________________________________________________________________________________
    bidirectional_15 (Bidirectional (None, 3, 256)       394240      bidirectional_14[0][0]           
    __________________________________________________________________________________________________
    time_distributed_7 (TimeDistrib (None, 3, 64)        16448       bidirectional_15[0][0]           
    __________________________________________________________________________________________________
    time_distributed_8 (TimeDistrib (None, 3, 1)         65          time_distributed_7[0][0]         
    ==================================================================================================
    Total params: 1,335,425
    Trainable params: 1,335,425
    Non-trainable params: 0
    __________________________________________________________________________________________________
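To see what the repeat-and-concatenate step does numerically, here is a NumPy-only sketch of the same transformation (the array names and sizes are made up for illustration):

```python
import numpy as np

batch, timesteps = 4, 20
series = np.random.rand(batch, timesteps, 2)  # the normal (batch, 20, 2) input
point = np.random.rand(batch, 2)              # the special (batch, 2) point

# RepeatVector: tile the point along a new time axis -> (batch, 20, 2)
repeated = np.repeat(point[:, np.newaxis, :], timesteps, axis=1)

# concat on the last axis -> (batch, 20, 4), which is what the Masking layer sees
combined = np.concatenate([series, repeated], axis=-1)
print(combined.shape)  # (4, 20, 4)
```

When training the actual Keras model, you would pass the two arrays as a list matching the order of the `inputs` argument, e.g. `model.fit([normal_data, key_points], targets, ...)`.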