Battle of the Transformers: ELECTRA, BERT, RoBERTa, or XLNet

ELECTRA is the new kid on the block. Let’s take a look at how it stacks up against the old guard!

Thilina Rajapakse
Towards Data Science
8 min read · May 9, 2020


Image by 272447 from Pixabay

One of the “secrets” behind the success of Transformer models is the technique of Transfer Learning. In Transfer Learning, a model (in our case, a Transformer model) is pre-trained on a gigantic…

