THOUGHTS AND THEORY
Google’s RFA: Approximating Softmax Attention Mechanism in Transformers
What Is the Attention Mechanism & Why Is RFA Better Than Softmax?
7 min read · Feb 27, 2021
Google has recently released a new approach, Random Feature Attention, to replace the softmax attention mechanism in transformers, achieving similar or better performance with…