Sentiment Analysis: Beyond Words

Becca R
Towards Data Science
6 min read · Mar 28, 2019


As any New York transplant knows, it’s hard to get good pizza outside of New York. Add Italian family heritage and, well, good pizza is a big deal. So when I endeavored to develop a Natural Language Processing project to explore tools for sentiment analysis, what better use of these tools than to help people like me find the perfect pizza place!

For this project I used restaurant reviews from the Yelp challenge dataset. In particular, I used restaurants that Yelp had classified as being in the “pizza restaurant” or “restaurant, pizza” categories.

Understanding that diners like and dislike restaurants for different reasons, I wasn’t so interested in the star rating a user gave a place. Rather, I wanted to get at which particular aspects a diner does and doesn’t like, so that a prospective diner can get a better sense of whether the place fits what they are looking for. This is particularly important for restaurants with 3–4 star ratings, which might arrive at that aggregate for a myriad of different reasons.

For example, maybe you are looking for a good, authentic NY-style slice of pizza. In that case, service and timeliness might not matter so much, so long as the cooks deliver. On the other hand, maybe you are planning a group dinner for a friend’s birthday. In that case, service and atmosphere will be quite important to the experience, and if some of your friends are picky eaters or have food sensitivities, so will a menu that offers choices beyond just pizza. Finally, if you are on a lunch break but craving melted cheese in any form, food quality and atmosphere probably matter less than how quickly the food arrives.

So how to get this aspect-based sentiment for pizza restaurants? First, I split each review into sentences, and used spaCy and gensim to get distinct topics that reviewers mentioned in each sentence (namely, food quality, service, wait times, atmosphere, and menu variety). Once I had my topics (I’ll leave topic modeling for another blog), I needed to figure out if a reviewer felt positively or negatively about that aspect of the restaurant. This post compares two ways to model reviewer sentiment: VADER and StanfordCoreNLP.
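As a rough sketch of that first step (this is illustrative, not the project’s actual code, and it assumes spaCy’s small English model en_core_web_sm is installed), splitting a review into sentences with spaCy looks something like this:

import spacy

spacy_nlp = spacy.load('en_core_web_sm')

def split_into_sentences(review_text):
    # spaCy marks sentence boundaries on the parsed Doc object
    doc = spacy_nlp(review_text)
    return [sent.text.strip() for sent in doc.sents]

for sentence in split_into_sentences('Great crust and sauce. The service was painfully slow though.'):
    print(sentence)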

Sentiment scoring with VADER

First, I tried the VADER sentiment package and defined a function, sentiment_analyzer_scores(), that prints VADER’s scores for a piece of text, including the compound rating, which runs from -1 (very negative) to 1 (very positive).

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def sentiment_analyzer_scores(text):
    score = analyzer.polarity_scores(text)
    print(text)
    print(score)

The first sentence I tried was pretty straightforward: “this place was amazing great food and atmosphere”. This review is clearly positive and, sure enough, the VADER compound sentiment score was 0.84. So far so good.

text_pos = 'this place was amazing  great food and atmosphere'
sentiment_analyzer_scores(text_pos)

VADER also did well on this pretty straightforward negative review, returning a compound score of -0.66:

text_neg = 'i didnt like their italian sub though just seemed like lower quality meats on it and american cheese'
sentiment_analyzer_scores(text_neg)

However, on this more nuanced example, VADER gets stuck. Take the review “everything tastes like garbage to me but we keep coming back because my wife loves the pasta”. This reviewer clearly does NOT like this restaurant, despite the fact that their wife “loves” the pasta (side note: this reviewer should win spouse of the year for continuing to eat garbage to please their wife!). Any food review containing the word “garbage” should read as negative, yet VADER returns a very positive compound score of 0.7.

text_amb = "everything tastes like garbage to me but we keep coming back because my wife loves the pasta"
sentiment_analyzer_scores(text_amb)

So what happened? The function below buckets each word of the text into a positive, neutral, or negative list based on VADER’s word-level scores. According to the readme, “VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media.” As such, it relies on the polarity of individual words to determine the overall sentiment.

import nltk
from nltk.tokenize import word_tokenize

nltk.download('punkt')
nltk.download('vader_lexicon')

def get_word_sentiment(text):
    tokenized_text = word_tokenize(text)

    pos_word_list = []
    neu_word_list = []
    neg_word_list = []

    for word in tokenized_text:
        compound = analyzer.polarity_scores(word)['compound']
        if compound >= 0.1:
            pos_word_list.append(word)
        elif compound <= -0.1:
            neg_word_list.append(word)
        else:
            neu_word_list.append(word)

    print('Positive:', pos_word_list)
    print('Neutral:', neu_word_list)
    print('Negative:', neg_word_list)

As the output below shows, the positive polarity of the words “loves” and “like” must be quite high. Further, without a broader syntactic understanding of this sentence, the only word that would register this sentence as negative is “garbage”. In this case, “garbage” is considered neutral, and the overall text is determined to be rather positive.

get_word_sentiment(text_amb)
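Based on the word-level breakdown described above, the output for this sentence comes out roughly as follows: only “like” and “loves” land in the positive list, nothing lands in the negative list, and everything else, including “garbage”, is neutral.

Positive: ['like', 'loves']
Neutral: ['everything', 'tastes', 'garbage', 'to', 'me', 'but', 'we', 'keep', 'coming', 'back', 'because', 'my', 'wife', 'the', 'pasta']
Negative: []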

Enter Stanford CoreNLP

Stanford’s CoreNLP sentiment model has just the solution to this problem, since it was trained on movie reviews, where a reviewer might discuss both positive and negative aspects of a film in the same sentence (e.g. “the plot was slow but the acting was great”).

According to the site, rather than looking at the sentiment of individual words, the model “actually builds up a representation of whole sentences based on the sentence structure. It computes the sentiment based on how words compose the meaning of longer phrases. This way, the model is not as easily fooled as previous models.”

Perfect! Luckily, too, there’s a Python wrapper that lets you make calls to the CoreNLP server (which returns results surprisingly quickly). To make the calls, you’ll need to pip install pycorenlp and import StanfordCoreNLP from pycorenlp. Then, in the terminal, cd into the Stanford CoreNLP folder and start the server with:

cd stanford-corenlp-full-2018-10-05
java -mx5g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000

Great — now let’s see how it did.

#!pip install pycorenlp
from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')

def get_sentiment(text):
    res = nlp.annotate(text,
                       properties={'annotators': 'sentiment',
                                   'outputFormat': 'json',
                                   'timeout': 1000,
                                   })
    print(text)
    print('Sentiment:', res['sentences'][0]['sentiment'])
    print('Sentiment score:', res['sentences'][0]['sentimentValue'])
    print('Sentiment distribution (0 - very negative, 4 - very positive):',
          res['sentences'][0]['sentimentDistribution'])

Passing in the review that calls the food garbage, the model classifies the overall sentence as pretty negative (0 is most negative, 4 is most positive). The sentiment distribution shows that there are some neutral and even positive elements in the sentence, but overall this is not a favorable assessment of the restaurant.

get_sentiment(text_amb)
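One practical note: annotate() returns one entry per sentence in res['sentences'], and get_sentiment() above reads only the first entry, which is fine here because each review was already split into sentences. To score a full multi-sentence review instead, a minimal sketch (the get_review_sentiments() name is my own, not part of pycorenlp) could loop over all of them:

def get_review_sentiments(review_text):
    # return a (sentiment label, sentiment value) pair for every sentence in the review
    res = nlp.annotate(review_text,
                       properties={'annotators': 'sentiment',
                                   'outputFormat': 'json',
                                   'timeout': 10000,
                                   })
    return [(s['sentiment'], s['sentimentValue']) for s in res['sentences']]

print(get_review_sentiments('The crust was perfect. The service was painfully slow.'))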

There’s also a cool live demo that shows how the model parses different points of the sentence into positive and negative aspects:

http://nlp.stanford.edu:8080/sentiment/rntnDemo.html

For good measure I’ll pass in the positive and negative sentences from above:

get_sentiment(text_pos)
get_sentiment(text_neg)
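To tie this back to the aspect-based goal, here is a rough sketch of how per-sentence sentiment could be rolled up by topic for each restaurant. The restaurant names, topic labels, and pandas layout below are stand-ins of my own, not the project’s actual pipeline:

import pandas as pd

# hypothetical per-sentence records: restaurant, topic assigned by the topic model,
# and the CoreNLP sentimentValue (0 = very negative, 4 = very positive)
records = [
    {'restaurant': 'pizza_place_a', 'topic': 'food quality', 'sentiment_value': 4},
    {'restaurant': 'pizza_place_a', 'topic': 'service', 'sentiment_value': 1},
    {'restaurant': 'pizza_place_a', 'topic': 'wait times', 'sentiment_value': 2},
    {'restaurant': 'pizza_place_b', 'topic': 'food quality', 'sentiment_value': 3},
]

df = pd.DataFrame(records)

# average sentiment per aspect for each restaurant
aspect_scores = df.groupby(['restaurant', 'topic'])['sentiment_value'].mean()
print(aspect_scores)

A diner hunting for the best slice could then rank places by the food-quality score, while someone planning a birthday dinner could weight service and atmosphere more heavily.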

So there you have it, a nuanced sentiment analysis package perfect for reviews of movies, books, consumer goods, and… pizza!

I should point out that this post is by no means a critique of VADER — it has some great features, such as its ability to recognize social media colloquialisms (“LOL”, emojis), and to pick up on emphasis from all caps and punctuation. Rather, my aim is to highlight a sentiment analysis tool that is well-suited for customer reviews containing a combination of positive and negative aspects.

I hope you find this post helpful and welcome any feedback or questions in the comments!


