XLM: Cross-Lingual Language Model
Understanding Transformer-Based Self-Supervised Architectures
Sep 13, 2020
Models like BERT (Devlin et al.) and GPT (Radford et al.) have achieved state-of-the-art results in language understanding. However, these models are pre-trained on only one language. Recently, efforts have been made to move beyond monolingual representations and build universal cross-lingual models…