Tags: tensorflow, keras, attention-model, encoder-decoder, neural-mt

Implement attention in a vanilla encoder-decoder architecture


I have built a vanilla encoder-decoder architecture for English-to-French NMT, as follows:

[Image: my encoder-decoder architecture]

I want to know how to integrate a Keras attention layer here. Either the layer from the Keras docs or an attention module from a third-party repo is welcome. I just need to integrate it, see how it works, and fine-tune it.

Full code is available here.

I am not showing the code in this post because it is large and complex.
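For context, the core computation an attention layer adds is small: at each decoder step, score every encoder output against the current decoder state, normalize the scores with a softmax, and take the weighted sum as a context vector. A minimal NumPy sketch of Bahdanau-style additive attention (shapes and weight names `W_a`, `U_a`, `v_a` are illustrative assumptions, not from the original code):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bahdanau_attention(enc_outputs, dec_state, W_a, U_a, v_a):
    """One attention step for a single decoder timestep.

    enc_outputs: (T_enc, d)  encoder hidden states
    dec_state:   (d,)        current decoder hidden state
    W_a, U_a:    (d, a)      projection matrices (assumed names)
    v_a:         (a,)        scoring vector
    """
    # Additive score: v_a . tanh(W_a h_enc + U_a s_dec) for each encoder step.
    scores = np.tanh(enc_outputs @ W_a + dec_state @ U_a) @ v_a  # (T_enc,)
    weights = softmax(scores)                                    # sum to 1
    context = weights @ enc_outputs                              # (d,)
    return context, weights
```

The context vector is then concatenated with the decoder state (or its output) before the final projection over the target vocabulary; a Keras attention layer does the same thing batched over all timesteps.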


Solution

  • Finally, I resolved the issue. I am using a third-party attention layer by Thushan Ganegedara, specifically its AttentionLayer class, integrated into my architecture as follows.

    Architecture with attention
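Since the full code is not shown here, a comparable integration can be sketched with Keras's built-in `layers.Attention` (Luong-style dot-product attention) instead of the third-party class; the wiring is the same. Vocabulary sizes and hidden units below are placeholder assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Assumed sizes; adjust to your data.
src_vocab, tgt_vocab, units = 5000, 5000, 256

# Encoder: keep the full output sequence for attention, plus the final states.
enc_in = layers.Input(shape=(None,), name="encoder_inputs")
enc_emb = layers.Embedding(src_vocab, units)(enc_in)
enc_seq, enc_h, enc_c = layers.LSTM(
    units, return_sequences=True, return_state=True)(enc_emb)

# Decoder: initialized from the encoder's final states, as in a vanilla enc-dec.
dec_in = layers.Input(shape=(None,), name="decoder_inputs")
dec_emb = layers.Embedding(tgt_vocab, units)(dec_in)
dec_seq, _, _ = layers.LSTM(
    units, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[enc_h, enc_c])

# Attention: decoder outputs query the encoder output sequence.
context = layers.Attention()([dec_seq, enc_seq])

# Concatenate context with decoder outputs before the vocabulary projection.
concat = layers.Concatenate()([dec_seq, context])
outputs = layers.TimeDistributed(
    layers.Dense(tgt_vocab, activation="softmax"))(concat)

model = Model([enc_in, dec_in], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

The only structural change from the vanilla architecture is the `Attention` + `Concatenate` pair between the decoder LSTM and the output `Dense`; swapping in Ganegedara's AttentionLayer at that same point (it additionally returns the attention weights for visualization) follows the same pattern.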