Tags: tensorflow, deep-learning, nlp, attention-model

What is the difference between Luong attention and Bahdanau attention?


These two attention mechanisms are used in seq2seq models. They are introduced as multiplicative and additive attention in this TensorFlow documentation. What is the difference between them?


Solution

  • They are explained well in a PyTorch seq2seq tutorial.

    The main difference lies in how the similarity between the current decoder state and the encoder outputs is scored: Luong attention uses a multiplicative (dot-product) score, while Bahdanau attention uses an additive score computed by a small feed-forward network.
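    Concretely, Luong's "general" multiplicative score is h_tᵀ W_a h_s, while Bahdanau's additive score is v_aᵀ tanh(W_1 h_t + W_2 h_s). Below is a minimal TensorFlow sketch of both scoring functions to make the contrast explicit; the hidden size, shapes, and weight names (W_a, W_1, W_2, v_a) are illustrative assumptions, not code from the linked tutorial or documentation.

    ```python
    import tensorflow as tf

    units = 8                                  # hidden size (assumed for illustration)
    query = tf.random.normal([1, units])       # decoder state h_t, shape (1, units)
    keys = tf.random.normal([5, units])        # encoder outputs h_s, shape (T, units)

    # Luong (multiplicative, "general" form): score = h_t^T W_a h_s
    W_a = tf.Variable(tf.random.normal([units, units]))
    luong_scores = tf.matmul(tf.matmul(query, W_a), keys, transpose_b=True)  # (1, T)

    # Bahdanau (additive): score = v_a^T tanh(W_1 h_t + W_2 h_s)
    W_1 = tf.Variable(tf.random.normal([units, units]))
    W_2 = tf.Variable(tf.random.normal([units, units]))
    v_a = tf.Variable(tf.random.normal([units, 1]))
    hidden = tf.tanh(tf.matmul(query, W_1) + tf.matmul(keys, W_2))  # (T, units), broadcast over T
    bahdanau_scores = tf.matmul(hidden, v_a)                        # (T, 1)

    # In both cases the scores are softmax-normalized into attention weights
    # over the encoder time steps.
    luong_weights = tf.nn.softmax(luong_scores, axis=-1)
    bahdanau_weights = tf.nn.softmax(tf.transpose(bahdanau_scores), axis=-1)
    ```

    The multiplicative score is a single matrix product and is cheaper to compute, while the additive score passes the combined states through a tanh layer and a learned vector v_a, which was the original formulation in Bahdanau's paper.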