How to Train BERT

Quick-fire guide to training a transformer

James Briggs
Towards Data Science
8 min read · Jun 15, 2021

Forms like this require pretraining — image by author.

The success of transformer models is in large part thanks to our ability to take a model that has been pre-trained on gigantic datasets by the likes of Google and OpenAI, and apply it to our own use cases.

Sometimes this is all we need: we take the pre-trained model and use it as-is.
