Hands-on Tutorials

How to Build a WordPiece Tokenizer For BERT

Easy guide to building a BertTokenizer from scratch

James Briggs
Towards Data Science
5 min read · Sep 14, 2021

For more specialized use cases, building a transformer model from scratch is often the only option. Although BERT and other transformer models have been pre-trained for many languages and domains, they do not cover everything.
