An intuitive explanation of Self Attention
A step-by-step explanation of the multi-headed self-attention block
10 min read · Oct 7, 2020
In this article, I am going to explain everything you need to know about self-attention.
What do transformer neural networks contain that makes them so much more powerful and better-performing than regular recurrent neural networks?