Distilling Transformers: Data-efficient Image Transformers (DeiT)
Jan 24, 2021
transformers go brum brum
Hi guys! Today we are going to implement "Training data-efficient image transformers & distillation through attention", a paper that introduces DeiT, a new method to perform knowledge distillation on Vision Transformers.
You will soon see how elegant and simple this new approach is.