
In my previous article found here, I provided a step-by-step guide on how to perform topic modeling and sentiment analysis with VADER on Amazon Alexa reviews. During that analysis I realized there were multiple Alexa devices, and that I should have compared them from the beginning to see how positive and negative feedback differs across models, insight that is more specific and would be more useful to Amazon (insert embarrassed face here). Here, I will categorize each review by Echo model based on its variation, and then analyze the top 3 positively rated models through topic modeling and sentiment analysis.
To review, I am analyzing reviews of Amazon’s Echo devices found here on Kaggle using NLP techniques.
Let’s first import our libraries:
from wordcloud import WordCloud, STOPWORDS
import numpy as np
import pandas as pd
import pickle
import seaborn as sns
import matplotlib.pyplot as plt
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import train_test_split
import gensim
from gensim import corpora
from gensim.models import LdaModel, LdaMulticore
from nltk.corpus import stopwords
from gensim.models.word2vec import Word2Vec
from multiprocessing import cpu_count
import gensim.downloader as api
import plotly.express as px
import plotly.graph_objects as go
from sklearn.feature_selection import chi2
Using pickle, we will load our cleaned file from the data preprocessing step (in this article, I discussed cleaning and preprocessing text data) and take a look at our variation column.
with open('Saved Models/alexa_reviews_clean.pkl','rb') as read_file:
    df = pickle.load(read_file)
df['variation'].value_counts()

I’m not very interested in the Fire TV Stick since it is a device limited to TV capabilities, so I will remove it and focus only on Echo devices.
df=df[df.variation!='Configuration: Fire TV Stick']
df['variation'].value_counts()

Great, now let’s separate these variations into the different Echo models: Echo, Echo Dot, Echo Show, Echo Plus, and Echo Spot.
# ECHO 2nd Gen - charcoal fabric, heather gray fabric,
# sandstone fabric, oak finish, walnut finish
df['model'] = np.where(df.variation.str.contains('Charcoal Fabric ') |
                       df.variation.str.contains('Heather Gray Fabric ') |
                       df.variation.str.contains('Sandstone Fabric ') |
                       df.variation.str.contains('Oak Finish ') |
                       df.variation.str.contains('Walnut Finish '), 'echo', df['variation'])
# ECHO DOT - black dot, white dot, black, white
# (the broad 'Black'/'White' matches also catch the Show/Plus/Spot variations,
# but those rows are overwritten by the more specific assignments below)
df['model'] = np.where(df.variation.str.contains('Black Dot') |
                       df.variation.str.contains('White Dot') |
                       df.variation.str.contains('Black') |
                       df.variation.str.contains('White'), 'echo dot', df['model'])
# ECHO SHOW - black show, white show
df['model'] = np.where(df.variation.str.contains('Black Show') |
                       df.variation.str.contains('White Show'), 'echo show', df['model'])
# ECHO PLUS - black plus, white plus
df['model'] = np.where(df.variation.str.contains('Black Plus') |
                       df.variation.str.contains('White Plus'), 'echo plus', df['model'])
# ECHO SPOT - black spot, white spot
df['model'] = np.where(df.variation.str.contains('Black Spot') |
                       df.variation.str.contains('White Spot'), 'echo spot', df['model'])
Next, we will split the original df by model type and pickle each resulting dataframe, giving us five pickled Echo model dataframes.
# SIMILAR FOR EACH MODEL TYPE
echo=df[df['model']=='echo']
pickle.dump(echo,open("Saved Models/echo.pkl","wb"))
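The same split-and-pickle step can be applied to the remaining models; here is a minimal sketch that loops over all five model labels (the file-naming pattern for the non-echo models is an assumption on my part):
# Split the dataframe by model label and pickle each subset
for model_name in ['echo', 'echo dot', 'echo show', 'echo plus', 'echo spot']:
    subset = df[df['model'] == model_name]
    pickle.dump(subset, open(f"Saved Models/{model_name}.pkl", "wb"))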
Now, let’s look at some visualizations of the different Echo models, using plotly (which I’ve become a HUGE fan of).
values=df['model'].value_counts()
fig = go.Figure(data=[go.Bar(x=values.index, y=values, text=values, textposition='auto')])
fig.update_xaxes(title_text='Echo Models')
fig.update_yaxes(title_text='Number of Models')
fig.update_layout(title_text='Distribution of Echo Models')
fig.show()

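The grouped bar chart below compares rating counts across models, and it needs each model's rating counts computed first. A minimal sketch, assuming per-model dataframes (echo_dot, echo_show, echo_plus, echo_spot) were created the same way as echo above:
# Rating counts for each model (variable names match those used in the figure below)
echo_values = echo['rating'].value_counts().sort_index()
echodot = echo_dot['rating'].value_counts().sort_index()
echoshow = echo_show['rating'].value_counts().sort_index()
echoplus = echo_plus['rating'].value_counts().sort_index()
echospot = echo_spot['rating'].value_counts().sort_index()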
fig = go.Figure(data=[
go.Bar(name='echo', x=echo_values.index, y=echo_values, text=echo_values, textposition='auto'),
go.Bar(name='echo spot', x=echospot.index, y=echospot, text=echospot, textposition='auto'),
go.Bar(name='echo show', x=echoshow.index, y=echoshow, text=echoshow, textposition='auto'),
go.Bar(name='echo dot', x=echodot.index, y=echodot, text=echodot, textposition='auto'),
go.Bar(name='echo plus', x=echoplus.index, y=echoplus, text=echoplus, textposition='auto'),
])
fig.update_xaxes(title_text='Ratings')
fig.update_yaxes(title_text='Number of Ratings')
fig.update_layout(title_text='Distribution of Echo Ratings Across Models')
# Change the bar mode
fig.update_layout(barmode='group')
fig.show()

From these graphs we can see that the most common Echo model among the reviews is the Echo Dot, and that the top 3 most popular Echo models based on rating are the Echo Dot, Echo, and Echo Show. I decided to focus on only these three models for further analysis.
To find out whether the sentiment of the reviews matches the ratings, I performed sentiment analysis using VADER on the top 3 Echo models.
# FUNCTION USED TO CALCULATE SENTIMENT SCORES FOR ECHO, ECHO DOT, AND ECHO SHOW.
def sentimentScore(sentences):
    analyzer = SentimentIntensityAnalyzer()
    results = []
    for sentence in sentences:
        vs = analyzer.polarity_scores(sentence)
        print(str(vs))
        results.append(vs)
    return results
Using this function, I calculated sentiment scores for each review, put them into a new dataframe, and then combined it with the original dataframe, as shown below.
# ECHO
with open('Saved Models/echo.pkl','rb') as read_file:
    echo = pickle.load(read_file)
echo_sent = sentimentScore(echo['new_reviews'])
echo_sent_df = pd.DataFrame(echo_sent)
echo.index = echo_sent_df.index
echo_sent_df['rating_1'] = echo['rating']
echo_vader = pd.concat([echo, echo_sent_df], axis=1)
echo_vader.head()

The same steps were applied to the Echo Dot and Echo Show, and all resulting dataframes were combined into one.
I then took the average positive and negative sentiment score for each model. As you can see from the charts below, the average positive sentiment score of the reviews is about 10 times higher than the negative, suggesting that the ratings are reliable.
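A minimal sketch of that combine-and-average step, assuming the scored Echo Dot and Echo Show dataframes are named echodot_vader and echoshow_vader:
# Stack the three scored dataframes and average the VADER pos/neg scores per model
all_vader = pd.concat([echo_vader, echodot_vader, echoshow_vader], ignore_index=True)
all_vader.groupby('model')[['pos', 'neg']].mean()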


Next, I performed topic modeling on the top 3 Echo models using LDA. In a process identical to my previous post, I created the inputs for the LDA model using corpora and trained the model to reveal the top 3 topics for the Echo, Echo Dot, and Echo Show.
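As a rough sketch of that step for a single model (the tokenization, number of topics, and passes here are illustrative assumptions, not the exact settings from the previous post):
# Build the dictionary and bag-of-words corpus inputs for LDA
docs = [review.split() for review in echo['new_reviews']]
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
# Train a 3-topic LDA model and print the top words per topic
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=3, passes=10, random_state=42)
for topic in lda.print_topics(num_words=5):
    print(topic)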

For the Echo, the most common topics were: ease of use, love that the Echo plays music, and sound quality.

For the Echo Dot, the most common topics were: works great, speaker, and music.

For the Echo Show, the most common topics were: love the videos, like it!, and love the screen. It should be noted that these topic labels are my own interpretation, and you may draw your own conclusions from these results. From these analyses, we can see that although the Echo and Echo Dot are more popular for playing music and their sound quality, users do appreciate the integration of a screen in an Echo device with the Echo Show.
Next, using a TF-IDF vectorizer on bigrams, I also analyzed what users loved and hated about their Echo devices by looking at the words that contributed most to positive and negative feedback.
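# NOTE: 'sentiment' is assumed to be a positive/negative label assigned to each
# review from the VADER scores computed earlier (e.g. using the compound score)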
neg_alexa = echo[echo['sentiment']=='negative']
pos_alexa = echo[echo['sentiment']=='positive']
# Echo Model - Negative (change neg_alexa to pos_alexa for positive feedback)
tfidf_n = TfidfVectorizer(ngram_range=(2, 2))
X_tfidf_n = tfidf_n.fit_transform(neg_alexa['new_reviews'])
y_n = neg_alexa['rating']
chi2score_n = chi2(X_tfidf_n, y_n)[0]
scores = list(zip(tfidf_n.get_feature_names(), chi2score_n))
chi2_n = sorted(scores, key=lambda x:x[1])
topchi2_n = list(zip(*chi2_n[-10:]))
x_n=range(len(topchi2_n[1]))
fig, ax = plt.subplots(figsize=(16,9))
ax.barh(x_n, topchi2_n[1], align='center', alpha=1, color='salmon')
plt.title('Echo Negative Feedback', fontsize=24, weight='bold')
# x-axis
plt.xlabel("Feature Score", fontsize=22, weight='bold')
plt.xticks(fontsize=18)
#y-axis
labels = topchi2_n[0]
plt.yticks(x_n, labels, fontsize=18)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(True)
ax.spines['left'].set_visible(True)
fig = plt.gcf()
plt.show()
plt.draw()


From these graphs we can see that some users thought the Echo "worked awesome" and provided helpful responses, while for others the device hardly worked and had too many features.
Let’s see the words that contributed to positive and negative sentiments for the Echo Dot and Echo Show.


For the Echo Dot, we can see that for some users it is a great device and easy to use, while other users complained that the Echo Dot did not play music and disliked that a Prime subscription was needed. Lastly, let's see the results for the Echo Show.


From these graphs, we can see that some users enjoy being able to make calls and use YouTube, and find the Echo Show fairly easy to use, while other users describe the Echo Show as "dumb" and recommend not buying the device.
Analyzing Amazon Alexa devices by model is much more insightful than examining all devices as a whole, since the aggregate view does not tell us which areas need improvement for which devices, or which attributes users enjoy the most.
Thank you for reading! Here is a link to the GitHub repo 🙂