When Language Meets Data

Our weekly selection of must-read Editors' Picks and original features

Photo by 4motions Werbeagentur on Unsplash

From chatbots to sentiment analysis, we’re seeing an explosion of real-world use cases for textual data. Some of the buzziest innovations in AI revolve around models trained with ever-increasing quantities of text; on the flip side, we can trace many of the challenges the field is facing to limited, unrepresentative, or flat-out biased language datasets.

This week, we share six recent posts that cover data and language through a wide range of topics and approaches—NLP fans will have a blast, but so will programmers, data engineers, and AI enthusiasts. Let’s dive in!

  • The wall all large language models run into (for now). GPT-3 and similar generative models can produce text that sounds truthful even when it isn’t factual. Iulia Turc explores the issue of these models’ groundedness ("the ability to ground their statements into reality, or at least attribute them to some external source") and why it’s been so difficult to develop models that come close to human performance.
  • Natural language querying is making a splash. Up until recently, humans had to invent (and then learn) complex languages in order to communicate with computers and manipulate digital data. Andreas Martinson discusses the emerging world of NLQ—natural language querying—and how it might transform the work of data professionals for the better, as well as democratize access to databases.
  • Choosing the right tools to simplify complex NLP tasks. The difference between clunky and streamlined workflows can sometimes come down to seemingly trivial choices. Kat Li surveys five lesser-known Python libraries, from Pyspellchecker to Next Word Prediction, and explains how they can save time and effort when used in the right NLP context.

There’s always more to explore on TDS, so we hope you still have some time and stamina for a handful of excellent reads on other topics; we just couldn’t not share these with you.


Thank you, as always, for your passion and curiosity. To support the work we publish, consider sharing your favorite article on Twitter or LinkedIn, telling your data science colleagues about us, or becoming a Medium member.

Until the next Variable,

TDS Editors

