Overview of tokenization algorithms in NLP
Introduction to tokenization methods, including subword, BPE, WordPiece and SentencePiece
8 min read · Aug 12, 2020
This article is an overview of tokenization algorithms, covering word-level, character-level, and subword-level tokenization, with emphasis on BPE…