Learning NLP Language Models with Real Data

Published in Towards Data Science

11 min read · Jan 26, 2019

Language Models (LMs) estimate the relative likelihood of different phrases and are useful in many Natural Language Processing (NLP) applications. For example, they have been used by Twitter bots, the so-called ‘robot’ accounts, to form their own sentences.
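To give a rough feel for what "relative likelihood of phrases" means, here is a minimal sketch of a bigram model estimated by simple counting. The toy corpus, the `<s>`/`</s>` boundary tokens, and the helper name `phrase_probability` are all assumptions made for illustration; they are not taken from the notebook, and no smoothing is applied.

```python
from collections import Counter

# Toy corpus (an assumption for illustration, not data from the post).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

unigram_counts = Counter()  # how often each token appears as a bigram context
bigram_counts = Counter()   # how often each (previous, current) pair appears

for sentence in corpus:
    # Hypothetical start/end markers so sentence boundaries are modelled too.
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    unigram_counts.update(tokens[:-1])
    bigram_counts.update(zip(tokens, tokens[1:]))


def phrase_probability(phrase):
    """Maximum-likelihood bigram probability of a phrase (no smoothing)."""
    tokens = ["<s>"] + phrase.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if unigram_counts[prev] == 0:
            # Context never seen in the corpus: the estimate is zero.
            return 0.0
        prob *= bigram_counts[(prev, word)] / unigram_counts[prev]
    return prob


likely = phrase_probability("the cat sat on the mat")
unlikely = phrase_probability("the mat sat on the cat")
print(likely > unlikely)
```

Under these assumptions, a phrase built from word pairs seen in the corpus scores higher than a scrambled one, which is the sense in which an LM ranks phrases against each other. Any phrase containing an unseen word pair gets probability zero here, a limitation of unsmoothed counting.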

In this post, we will first formally define LMs and then demonstrate how they can be computed with real data. All of the methods are demonstrated with full code in the following Kaggle notebook:

https://www.kaggle.com/osbornep/education-learning-language-models-with-real-data

Written by Philip Osborne, PhD Researcher