Teaching Is Hard: How to Train Small Models That Outperform Their Large Counterparts
Distilling the knowledge of a large model is complex, but a new method shows remarkable performance
12 min read · Nov 11, 2023