Attention in Neural Networks

Some variations of attention architectures

Mahendran Venkatachalam
Towards Data Science
12 min read · Jul 7, 2019

In an earlier post, “Introduction to Attention”, we saw some of the key challenges addressed by the attention architecture introduced there (and referred to in Fig 1 below). There are other variants, in the same spirit, that you might come across as well. Among other aspects, these variants differ in “where” attention is used…
