Diffusion Transformer Explained

Exploring the architecture that brought transformers into image generation

Mario Larcher
Towards Data Science
12 min read · Feb 28, 2024

Image generated with DALL·E.

Introduction

After shaking up NLP and moving into computer vision with the Vision Transformer (ViT) and its successors, transformers are now entering the field of image generation. They are gradually becoming an alternative to the U-Net, the…

I like spaghetti code with 🍅. Currently Staff Applied Scientist in the Canva Image Generation team, formerly Head of Computer Vision at Enel Group.