The Evolution of Tokenization in NLP — Byte Pair Encoding in NLP

Introduction to different tokenization algorithms

Harshit Tyagi
Towards Data Science
11 min readOct 5, 2021

--

Image by Author

NLP may have been a little late to the AI epiphany but it is doing wonders with organizations like Google, OpenAI releasing state-of-the-art(SOTA) language models like BERT and GPT-2/3 respectively.

--

--