
April Edition: Adventures in Topic Modelling

Beyond sentiment analysis

MONTHLY EDITION

Photo by Suzy Hazelwood from Pexels

Computers are great at working with structured data like spreadsheets and database tables. Humans, however, mostly communicate in language and words, which computers find much harder to handle. A lot of information in the world is unstructured – for example, raw text in English or another language. How can we get a computer to understand unstructured text and extract information from it?

Natural Language Processing (NLP) is the sub-field of AI that is focused on enabling computers to understand and process human languages. If you’re new to data science, you’ll see that there’s an abundance of material out there covering all kinds of NLP-related tasks. The NLP blog posts I see most often are about sentiment analysis, that is, detecting whether a piece of text expresses a positive or negative sentiment. But NLP covers many more problems than that.

I’d like to draw your attention to topic modelling, a field within NLP that I’ve recently started taking a serious interest in. Topic modelling identifies latent patterns of word occurrences using the distribution of words in a collection of documents. The output is a set of topics consisting of clusters of words that co-occur in these documents according to certain patterns.
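To make "words that co-occur" concrete, here is a minimal sketch in plain Python that counts how often pairs of words appear in the same document. It is not a topic model; it only exposes the raw co-occurrence signal that methods such as LDA or NMF go on to factorise. The toy corpus, the stop-word list, and the function names are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration only.
docs = [
    "battery life is great and the battery charges fast",
    "the screen is bright and the screen is sharp",
    "battery drains fast when the screen is bright",
    "delivery was late and the delivery box was damaged",
]

# A tiny hand-picked stop-word list (an assumption, not a standard list).
STOPWORDS = {"is", "and", "the", "was", "when"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def cooccurrence_counts(documents):
    """Count, for each word pair, the number of documents containing both."""
    counts = Counter()
    for doc in documents:
        vocab = sorted(set(tokenize(doc)))  # unique words, in a stable order
        for pair in combinations(vocab, 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence_counts(docs)

# Show the pairs that share more than one document.
for pair, n in pair_counts.most_common():
    if n > 1:
        print(pair, n)
```

Pairs such as ("battery", "fast") and ("bright", "screen") appear together in more than one document, while "battery" and "delivery" never do. A real topic model works from the full document-word distribution rather than from raw pair counts, but the intuition is the same.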

Why do I think topic modelling is interesting? Because these days more than ever, it’s not only about how we feel, but also about what is being said. In combination, sentiment analysis and topic modelling can be used to perform what’s called aspect-based sentiment analysis, where the goal is to extract both the entities described in the text and the sentiment expressed towards each of them.
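As a rough illustration of the idea, the sketch below pairs each aspect mentioned in a review with a crude sentiment score. It is a keyword-matching toy under stated assumptions, not how production aspect-based systems work: the aspect keywords, the positive and negative word lists, and the function name aspect_sentiment are all hypothetical, and a real pipeline would learn aspects (for example with a topic model) and use a trained sentiment classifier.

```python
import re

# Hypothetical aspect keywords and sentiment lexicons, invented for this sketch.
ASPECT_KEYWORDS = {
    "battery": {"battery", "charge", "charging"},
    "screen": {"screen", "display"},
    "delivery": {"delivery", "shipping"},
}
POSITIVE = {"great", "fast", "bright", "love"}
NEGATIVE = {"late", "damaged", "poor", "dim"}


def aspect_sentiment(review):
    """Return a dict mapping each mentioned aspect to a summed sentiment score."""
    results = {}
    # Split into rough clauses on punctuation and on the word "but".
    for clause in re.split(r"[.,;!?]|\bbut\b", review.lower()):
        words = set(re.findall(r"[a-z']+", clause))
        # Score = positive words minus negative words found in the clause.
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:
                results[aspect] = results.get(aspect, 0) + score
    return results


result = aspect_sentiment(
    "The battery is great, but the delivery was late and the box was damaged."
)
print(result)
```

A single overall sentiment label would call this review mixed; scoring per clause shows that the battery is praised while the delivery is criticised.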

For businesses, knowing how customers react to particular parts of a service or product can support use cases including product development and quality control, communications, customer support, and decision-making. This is much more information than just knowing whether customers are happy or unhappy, and it can support the continuous development and improvement of the business.

Lowri Williams, Editorial Associate at Towards Data Science


Interactive Topic Modeling with BERTopic

An in-depth guide to topic modeling with BERTopic

By Maarten Grootendorst – 7 min read


Topic Modeling Articles with NMF

Extracting topics is a good unsupervised data-mining technique to discover the underlying relationships between texts.

By Rob Salgado – 12 min read


Topic Modeling Tutorial with Latent Dirichlet Allocation (LDA)

A practical guide with proven hands-on Python code. Find what people are tweeting about.

By Michel Kana, Ph.D – 5 min read


Introduction to Topic Modeling using Scikit-Learn

Explore 3 unsupervised techniques to extract important topics from documents

By Ng Wai Foong – 10 min read


Understanding NLP and Topic Modeling Part 1

We Explore How Extracting Topics Via NLP Helps Us Data Science Better

By Tony Yiu – 8 min read


Topic Modeling in Power BI using PyCaret

In this post, we will see how we can implement topic modeling in Power BI using PyCaret.

By Moez Ali – 7 min read


Topic Modelling: Going Beyond Token Outputs

An investigation into how to assign topics with meaningful titles

By Lowri Williams – 9 min read


Topic modelling with PLSA

PLSA or Probabilistic Latent Semantic Analysis is a technique used to model information under a probabilistic framework.

By Dhruvil Karani – 5 min read


Sentiment Analysis: Aspect-Based Opinion Mining

An investigation into sentiment analysis and topic modelling techniques.

By Lowri Williams – 8 min read


Topic Modelling in Python with NLTK and Gensim

In this post, we will learn how to identify which topic is discussed in a document, called topic modelling.

By Susan Li – 6 min read




We also thank all the great new writers who joined us recently: Vivienne DiFrancesco, Monica Indrawan, Ouaguenouni Mohamed, Layne Sadler, Kendric Ng, Soroush Safaei, Alexandra Souly, Gant Laborde, abhi saini, Eden Molina, Wojtek Pyrak, Bora Tunca, Sam Ansari, Mahmoud Harmouch, Ajay Arunachalam, Maxim Ziatdinov, Sajjad Shumaly, Juan Samuel, Serhii Pospielov, Fernando Carrillo, Yann Morize, Sebastian Carino, Peng Yan, Paul Brunzema, Anders Borges, Ben Bogart, Xiao-Yang Liu, Alex Wagner, Michele Cavazza, Dimitris Dais, Julian Hatzky, Evans Doe Ocansey, Prajwalan Karanjit, Iqbal Ali, Stefan Hrouda-Rasmussen, Mike Casale, Maham Faisal Khan, Zainul Arifin, Silja Voolma, Ph.D., Will Nobles, Ben Santos, Mai Stafford, and many others. We invite you to take a look at their profiles and check out their work.

