Why Are There So Many Tokenization Methods For Transformers?

Five routes to the same destination?

James Briggs
Towards Data Science
5 min read · Jul 27, 2021

Image by author

HuggingFace’s transformers library is the de facto standard for NLP. Used by practitioners worldwide, it’s powerful, flexible, and easy to use. It achieves this through a fairly large (and complex) codebase, which raises the question:

Why are there so many…
