Tags: deep-learning, recurrent-neural-network, attention-model, feed-forward

Can the attention mechanism be applied to structures like feedforward neural networks?


Recently, I have been learning about encoder-decoder networks and the attention mechanism, and I found that many papers and blog posts implement attention on top of RNNs.

I am interested in whether other kinds of networks can incorporate attention mechanisms. For example, could the encoder be a feedforward neural network while the decoder is an RNN? Can feedforward neural networks without time-series inputs use attention mechanisms? If so, please give me some suggestions. Thank you in advance!


Solution

  • In general, feed-forward networks treat features as independent; convolutional networks focus on relative location and proximity; and RNNs and LSTMs have memory limitations and tend to read in one direction.

    In contrast to these, attention (and the Transformer built on it) can gather context about a word from distant parts of a sentence, both before and after the position where the word appears, in order to encode information that helps the model understand the word and its role in the sentence.

    A good example of attention in a purely feed-forward network is Raffel & Ellis, "Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems":

    https://arxiv.org/pdf/1512.08756.pdf

    Hope this is useful. A minimal sketch of the attention computation from that paper is given below.
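
As a rough illustration (not the paper's exact code), here is a minimal NumPy sketch of feed-forward attention: a learned scoring function a(h_t) assigns each time step a scalar score, a softmax turns the scores into weights, and the context vector is the weighted average of the per-step features, so no recurrence is needed. The tanh scoring function and the names `feed_forward_attention`, `w`, and `b` are assumptions made for this sketch.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def feed_forward_attention(H, w, b):
    """Attention over a sequence without any recurrence.

    H : (T, d) array of per-step feature vectors (e.g. raw inputs
        or the outputs of a feed-forward encoder)
    w : (d,) weights of the scoring function a(h) = tanh(h @ w + b)
    b : scalar bias

    Returns the context vector c = sum_t alpha_t * h_t and the
    attention weights alpha.
    """
    scores = np.tanh(H @ w + b)   # e_t = a(h_t): one scalar per step
    alpha = softmax(scores)       # attention weights, sum to 1
    c = alpha @ H                 # weighted average of the h_t
    return c, alpha

# Toy usage: 5 time steps, 8 features each.
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))
w = rng.normal(size=8)
c, alpha = feed_forward_attention(H, w, b=0.0)
print(alpha)      # weights over the 5 steps
print(c.shape)    # (8,)
```

In a real model, `w` and `b` would be trained jointly with the rest of the network, and the context vector `c` would be fed to whatever comes next (a classifier, or an RNN decoder as asked in the question).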