Tags: tensorflow, keras, attention-model, encoder-decoder, neural-mt

Implement attention in a vanilla encoder-decoder architecture


I have built a vanilla encoder-decoder architecture for English-to-French NMT, as follows:

[Image: my encoder-decoder architecture]

I want to know how to integrate a Keras attention layer here. Either the layer from the Keras docs or an attention module from a third-party repo is welcome. I just need to integrate it, see how it works, and fine-tune it.

Full code is available here.

I am not showing the code in this post because it is large and complex.
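For context, the core computation an attention layer adds is small: at each decoder step, score every encoder output against the current decoder state, normalize the scores with a softmax, and take the weighted sum as a context vector. A minimal NumPy sketch of Bahdanau-style additive attention (shapes and weight names `W_a`, `U_a`, `v_a` are illustrative assumptions, not from the original code):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bahdanau_attention(enc_outputs, dec_state, W_a, U_a, v_a):
    """One attention step for a single decoder timestep.

    enc_outputs: (T_enc, d)  encoder hidden states
    dec_state:   (d,)        current decoder hidden state
    W_a, U_a:    (d, a)      projection matrices (assumed names)
    v_a:         (a,)        scoring vector
    """
    # Additive score: v_a . tanh(W_a h_enc + U_a s_dec) for each encoder step.
    scores = np.tanh(enc_outputs @ W_a + dec_state @ U_a) @ v_a  # (T_enc,)
    weights = softmax(scores)                                    # sum to 1
    context = weights @ enc_outputs                              # (d,)
    return context, weights
```

The context vector is then concatenated with the decoder state (or its output) before the final projection over the target vocabulary; a Keras attention layer does the same thing batched over all timesteps.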


Solution

  • Finally, I resolved the issue. I am using a third-party attention layer by Thushan Ganegedara, specifically its AttentionLayer class, integrated into my architecture as follows.

    Architecture with attention
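Since the full code is not shown here, a comparable integration can be sketched with Keras's built-in `layers.Attention` (Luong-style dot-product attention) instead of the third-party class; the wiring is the same. Vocabulary sizes and hidden units below are placeholder assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Assumed sizes; adjust to your data.
src_vocab, tgt_vocab, units = 5000, 5000, 256

# Encoder: keep the full output sequence for attention, plus the final states.
enc_in = layers.Input(shape=(None,), name="encoder_inputs")
enc_emb = layers.Embedding(src_vocab, units)(enc_in)
enc_seq, enc_h, enc_c = layers.LSTM(
    units, return_sequences=True, return_state=True)(enc_emb)

# Decoder: initialized from the encoder's final states, as in a vanilla enc-dec.
dec_in = layers.Input(shape=(None,), name="decoder_inputs")
dec_emb = layers.Embedding(tgt_vocab, units)(dec_in)
dec_seq, _, _ = layers.LSTM(
    units, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[enc_h, enc_c])

# Attention: decoder outputs query the encoder output sequence.
context = layers.Attention()([dec_seq, enc_seq])

# Concatenate context with decoder outputs before the vocabulary projection.
concat = layers.Concatenate()([dec_seq, context])
outputs = layers.TimeDistributed(
    layers.Dense(tgt_vocab, activation="softmax"))(concat)

model = Model([enc_in, dec_in], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

The only structural change from the vanilla architecture is the `Attention` + `Concatenate` pair between the decoder LSTM and the output `Dense`; swapping in Ganegedara's AttentionLayer at that same point (it additionally returns the attention weights for visualization) follows the same pattern.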