July Edition: Text Understanding

8 Must-Read Articles

A Practitioner’s Guide to Natural Language Processing

By Dipanjan (DJ) Sarkar – 31 min read.

Unstructured data, especially text, images and videos, contains a wealth of information. However, due to the inherent complexity in processing and analyzing this data, people often refrain from spending extra time and effort venturing out from structured datasets to analyze these unstructured sources of data, which can be a potential gold mine.


How To Create Natural Language Semantic Search For Arbitrary Objects With Deep Learning

By Hamel Husain – 13 min read.

The power of modern search engines is undeniable: you can summon knowledge from the internet at a moment’s notice. Unfortunately, this superpower isn’t omnipresent. There are many situations where search is relegated to strict keyword search, or when the objects aren’t text, search may not be available.


Multi-Class Text Classification with Scikit-Learn

By Susan Li – 11 min read.

There are lots of applications of text classification in the commercial world. For example, news stories are typically organized by topics; content or products are often tagged by categories; users can be classified into cohorts based on how they talk about a product or brand online…


Embedding Machine Learning Models to Web Apps

By Chamin Nalinda – 12 min read.

The best way to learn data science is by doing it, and there’s no alternative. In this post, I reflect on what I learned while developing a machine learning model that classifies movie reviews as positive or negative, and on how I embedded this model in a Python Flask web application.


Who’s Tweeting from the Oval Office?

By Greg Rafferty – 18 min read.

I’ve built a Twitter bot @whosintheoval which retweets each of Donald Trump’s tweets and offers a prediction for whether the tweet was written by Trump himself or by one of his aides. Be sure to follow the bot on Twitter and read on to learn how I built the model!


Trump, in his own words

By Alex P. Miller – 5 min read.

How did Donald Trump’s priorities change over the course of his 2016 presidential campaign? Did he get stronger on immigration as the campaign drew on? Or did he shift his focus to the economy? When did he start talking about Hillary’s emails?


The Variational Autoencoder as a Two-Player Game

By Max Frenzel – 17 min read.

The aim of this article series is to make the basic ideas behind variational autoencoders and the encoding of natural language as accessible as possible, as well as to encourage people already familiar with them to view them from a new perspective.


LDA2vec: Word Embeddings in Topic Models

By Lars Hulstaert – 11 min read.

The general goal of a topic model is to produce interpretable document representations which can be used to discover the topics or structure in a collection of unlabelled documents. An example of such an interpretable document representation is: document X is 20% topic a, 40% topic b and 40% topic c.

