Thoughts and Theory

Do Vision Transformers See Like Convolutional Neural Networks? (Paper Explained)

Akihiro FUJII
Towards Data Science

The Vision Transformer (ViT) has been gaining momentum in recent years. This article explains the paper “Do Vision Transformers See Like Convolutional Neural Networks?” (Raghu et al., 2021), published by Google Research and Google Brain, and explores the differences between the conventionally used CNN and the Vision Transformer.

The abstract of this paper and the content of this blog
