To Distil or Not To Distil: BERT, RoBERTa, and XLNet
Transformers are the undisputed kings of Natural Language Processing. But with so many different models around it can be tough to choose just one. Hopefully, this will help!
Published in
8 min readFeb 7, 2020
This is becoming a bit of a cliche, but Transformer models have transformed Natural…