How to Train BERT

Quick-fire guide to training a transformer

James Briggs
Towards Data Science
8 min read · Jun 15, 2021

Forms like this require pretraining — image by author.

The success of transformer models is in large part thanks to our ability to take a model that has been pre-trained on gigantic datasets by the likes of Google and OpenAI, and apply it to our own use cases.

Sometimes this is all we need: we take the pre-trained model and use it as-is.
