A Comprehensive Guide to Microsoft’s Swin Transformer

In-depth Explanation and Animations

James Loy
Towards Data Science
7 min read · May 20, 2022




Swin Transformer (Liu et al., 2021) is a transformer-based deep learning model that achieves state-of-the-art performance on vision tasks. Unlike its predecessor, the Vision Transformer (ViT) (Dosovitskiy et al., 2020), Swin Transformer is highly efficient and has greater…

