What exactly happens when we fine-tune BERT?
A closer look into some of the recent BERTology research
6 min read · Feb 21, 2022
Google’s BERT marked a paradigm shift in natural language modeling, in particular through its pre-training / fine-tuning workflow: after being pre-trained in an unsupervised way on a massive amount of text, the model can be rapidly fine-tuned for a specific downstream task…
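To make the workflow concrete, here is a minimal PyTorch sketch of the idea. A toy encoder stands in for the actual pre-trained BERT weights, and the data, layer sizes, and training loop are all illustrative assumptions, not details from the article:

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained encoder: in practice this would be BERT,
# with weights already learned on massive unlabeled text.
encoder = nn.Sequential(
    nn.Embedding(1000, 32),   # toy vocabulary of 1000 token ids
    nn.Flatten(1),            # flatten the sequence of embeddings
    nn.Linear(32 * 8, 64),    # toy "contextual representation" layer
)

# Task-specific head added on top for the downstream task
# (e.g. binary sentiment classification).
head = nn.Linear(64, 2)
model = nn.Sequential(encoder, head)

# Fine-tuning: briefly train the whole model (or only the head)
# on a small labeled dataset, typically with a low learning rate.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.CrossEntropyLoss()

tokens = torch.randint(0, 1000, (4, 8))   # a tiny batch of token ids
labels = torch.tensor([0, 1, 1, 0])       # downstream-task labels

for _ in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(tokens), labels)
    loss.backward()
    optimizer.step()
```

The key point is that only the small head is initialized from scratch; the encoder starts from (pre-trained) weights, which is why fine-tuning converges quickly on limited labeled data.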