What about ‘The Office’?
When I was younger, I heard about The Office all the time and kept bumping into scenes of it on the internet. I decided to give it a shot – it seemed like a cool show after all – but being in university and never having worked in an office, I didn’t get it and gave up after one or two episodes. Some years later, after having worked in a corporate office for a while, I randomly found it on Netflix and decided to give it another shot. I was star-struck! After watching a few episodes, I even felt as though I might be working for Dunder Mifflin. Every character seemed relatable and intimate, and it was kind of a relief to see that the existential dread of the average 9–5 job is a universal experience. I loved it!
Nevertheless, Michael always gave me the ick. I couldn’t understand him. He seemed foolish and often bothered everyone, yet we don’t actually get to hate him because he’s such a sweet and nice guy deep down. If you really think about it, everyone in the show is a bit of an asshole. Michael, Dwight, and Angela for sure, but over time, you also realize that Jim was quite annoying, and even Phyllis or Pam are salty at times and wouldn’t be much fun to work with in real life. Again, there are layers and layers to the humor of the series, much of which is non-verbal and relies on being relatable if you have similar working experiences. Additionally, the mockumentary filming style – facing the camera, lying in the interviews, etc. – adds another layer of humor complexity. Because of this exact complexity of the characters, the plot, and the direction, it is difficult to exactly pinpoint who is an asshole and who is a good guy.
This got me wondering, does the outlook we get on the characters make sense from the actual words and lines they deliver, or is it mostly non-verbal? To put it differently, if we were to analyze just what the characters say, leaving the non-verbal cues aside, would they be the same characters, or at least somewhat similar? Most importantly, if we could see and judge the characters objectively, would we like them? Would we be happy with a boss like Michael?
With this in mind, when I stumbled upon the SchrutePy library, providing the entire text transcripts from The Office in a tidy format, I was thrilled. In fact, I was so thrilled that I decided to give it a go and do a sentiment analysis of The Office, particularly focusing on Michael Scott.😉
More specifically, in the rest of this post I am going to:
- Acquire the transcripts of The Office in a tidy format using SchrutePy
- Perform a sentiment analysis on this dataset using NLTK and Hugging Face Transformers
- Visualize the results using Plotly
… and to do all this, I will be using Jupyter Lab, since it plays well with Plotly, which I love for creating visualizations in Python.
Let’s go! 🤸♀️️
What about Sentiment Analysis, NLTK, and Hugging Face Transformers?
Sentiment Analysis is a Natural Language Processing (NLP) technique used to determine the sentiment expressed in a piece of text. In general, it classifies text as positive or negative, or as expressing some more specific emotion, providing insights into the text’s emotional tone. The analyzed text can be anything – tweets, emails, customer chats, reviews, comments, books, or transcripts – offering companies and organizations endless opportunities for data analysis and meaningful insight extraction.
In general, there are two major approaches to sentiment analysis: rule-based and machine learning.
- In the rule-based approach, specific words are associated with a certain sentiment (positive, negative or neutral, or more complex sentiments such as anger or disgust) through a hard-coded list called a lexicon. In other words, this approach is based on predefined rules and word associations. For instance, the NRC lexicon associates the word ‘abuse’ with anger, disgust, fear, and sadness, and the word ‘adore’ with joy, anticipation, and trust.
- On the flip side, the machine learning approach allows for a more nuanced interpretation of the sentiment of a text, as it doesn’t just consider individual words, but also the order of the words (the meaning, if you will), similarly to how a human would evaluate it. This is especially useful when the text contains multiple words with contradicting sentiments. For example, if someone said, ‘The movie is surprising, with plenty of unsettling plot twists.’, they most likely enjoyed the movie. Nevertheless, under the rule-based approach the word ‘unsettling’ generally carries a negative sentiment.
Finally, the rule-based and machine learning approaches can be combined to form a hybrid approach to sentiment analysis. Hybrid sentiment analysis methods generally produce better results; however, they require more resources. In any case, human language still encompasses complexities that are challenging for algorithms to handle, such as negation like ‘I do not dislike cabin cruisers.’, or sarcasm like ‘Yeah, great. It took three weeks for my order to arrive.’.
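To make the limits of the rule-based approach concrete, here is a minimal sketch of a word-counting scorer. The tiny lexicon and the function name are made up for illustration (real lexicons such as NRC are far larger), so treat it as a toy rather than how any particular library works.

```python
# Toy lexicon for illustration only; not taken from any real lexicon
TOY_LEXICON = {
    "great": 1, "love": 1, "enjoy": 1,
    "dislike": -1, "unsettling": -1, "terrible": -1,
}

def rule_based_score(text):
    """Sum the polarity of every known word; unknown words count as 0."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    return sum(TOY_LEXICON.get(word, 0) for word in words)

# Negation: the scorer only sees 'dislike' and ignores 'not'
print(rule_based_score("I do not dislike cabin cruisers."))  # -1
# Sarcasm: the scorer only sees 'great'
print(rule_based_score("Yeah, great. It took three weeks for my order to arrive."))  # 1
```

Both sentences end up with the opposite sign of what a human reader would assign, which is exactly the kind of case where word order and context matter.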
In this post, I will be performing a sentiment analysis of The Office using NLTK and Hugging Face Transformers libraries. NLTK is a popular Python library for text processing, offering tools for classification, tokenization, stemming, tagging, parsing, and more. The Hugging Face Transformers library is a game-changer in the field of NLP. It provides thousands of pre-trained models for a variety of tasks such as text classification, sentiment analysis, named entity recognition, and text generation. By utilizing NLTK and Hugging Face Transformers, I aimed to achieve a thorough understanding of the sentiment dynamics of The Office, particularly focusing on Michael’s character. Given that the script of the series contains tons of sarcasm, with complex language, and humor derived from multiple layers of associations, I was genuinely curious about what the analysis results would reveal.
That’s what she said!
Setting up the environment 💻
The first thing we need to do is open a new Jupyter Lab notebook. Before everything else, we need to ensure that the essential libraries are installed. These are SchrutePy, NLTK, Transformers, and Plotly, and we can easily install them using pip:
pip install schrutepy transformers plotly nltk
Next, we import these libraries into our Jupyter Lab notebook, along with pandas, collections, and re, which are going to assist in our analysis:
from schrutepy import schrutepy
import plotly.express as px
import pandas as pd
import re
from collections import Counter
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import nltk
from nltk.corpus import stopwords
from nltk.corpus import opinion_lexicon
from nltk.tokenize import word_tokenize
We also need to download the following datasets for NLTK:
nltk.download('stopwords')
nltk.download('punkt')
And now we are all set! 👍
Loading the data from SchrutePy 👨 🌾
After setting up our environment in Jupyter Lab, the next thing we need to do is import the dataset that we’ll be using for the analysis. We can easily load the desired data using SchrutePy and take a look at the first few rows by:
# Load the Office script data
df = schrutepy.load_schrute()
df.set_index('index', inplace=True)
# Make sure every line is a string (missing values would otherwise break the text processing)
df['text'] = df['text'].astype(str)
df.head(5)

More specifically, the dataset includes every line delivered throughout the American version of The Office TV show, along with the season number, episode number and name, the director and writer of the episode, and the character that delivers the line. The column ‘text’ contains the transcript of the line that the character delivers, whereas the column ‘text_w_direction’ also includes any relevant stage directions and actions. For example, in row 12,050 of the dataset, Michael delivers the line ‘I do not buy woman’s clothes. I would not make that mistake again’, whereas in the ‘text_w_direction’ column, we get the additional info of ‘[Darryl laughs]’. For the rest of the analysis, I will be using just the ‘text’ column.

After successfully loading the dataset, we can now proceed with the text analysis.
Who’s talking the most? 📢
The entire TV show features a ton of characters – 773 unique characters, to be precise. We can identify the central characters that dominate the conversation by simply counting the lines delivered by each character. Here, I focus on the top 10 characters with the most lines throughout the show and then plot them in a Plotly bar chart.
# Count of lines delivered by each character
character_lines = df['character'].value_counts().reset_index()
character_lines.columns = ['character', 'line_count']
# Top 10 characters
top_10_characters = character_lines.head(10)
# Plotly bar chart
fig = px.bar(top_10_characters,
             x='character',
             y='line_count',
             title='Who talks the most?',
             labels={'character': 'Character', 'line_count': 'Line Count'})
fig.update_traces(marker_color='#4169E1')
fig.show()

It comes as no surprise that Michael is the most vocal character, with over 10,000 lines spoken throughout the series. Dwight and Jim follow as the next most talkative characters – a pattern that aligns with their central roles in the storyline. The fact that Michael just can’t shut up implies that maybe he’s not the greatest boss after all.
What Michael is saying: Unigrams, Bigrams & Trigrams 🔤
Now that we have identified Michael as the character who talks the most, it is interesting to take a look at what exactly he is saying. We can do this by examining the most popular words and phrases he delivers and plotting his most common unigrams, bigrams, and trigrams. In general, an n-gram is a continuous sequence of n tokens from a given piece of text.
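As a quick illustration of the definition, here is a small sketch that builds n-grams from a list of tokens. The helper name make_ngrams is made up for this example and is not part of NLTK or of the analysis code further down.

```python
def make_ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = ["that's", "what", "she", "said"]
print(make_ngrams(tokens, 1))  # unigrams: the tokens themselves
print(make_ngrams(tokens, 2))  # bigrams: "that's what", "what she", "she said"
print(make_ngrams(tokens, 3))  # trigrams: "that's what she", "what she said"
```

A text with fewer than n tokens simply yields an empty list.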
To do this, I first select the lines delivered by Michael and concatenate them into a single string.
# Select Michael's lines
michael_df = df[df['character'] == 'Michael']
all_text = ' '.join(michael_df['text'].astype(str).tolist())
Then, I load the stop words from NLTK. In general, stop words are common words – such as ‘the’, ‘is’, ‘in’, or ‘and’ – that are usually filtered out in text analysis because they do not carry any significant meaning. Additionally, I keep contractions intact during tokenization. In other words, words like ‘don’t’ or ‘I’d’ are preserved and treated as single tokens.
# Load stop words
stop_words = set(stopwords.words('english'))
# Tokenize the text while preserving contractions
tokens = re.findall(r"\b\w[\w']*\b", all_text)
# Lowercase the tokens and exclude stop words
filtered_tokens = [token.lower() for token in tokens if token.lower() not in stop_words]
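To see what this tokenization step does on a single line, here is a small self-contained sketch. It assumes the pattern is a word character followed by any mix of word characters and straight apostrophes, and it uses a made-up mini stop word set in place of NLTK’s English list so that it runs without any downloads.

```python
import re

# Hypothetical mini stop word list, standing in for NLTK's English list
TOY_STOP_WORDS = {"i", "that", "is", "what", "she"}

line = "I don't know. I'd say that is what she said!"

# A word character followed by any mix of word characters and apostrophes
tokens = re.findall(r"\b\w[\w']*\b", line)
filtered = [t.lower() for t in tokens if t.lower() not in TOY_STOP_WORDS]

print(tokens)    # contractions such as "don't" and "I'd" stay in one piece
print(filtered)  # ["don't", 'know', "i'd", 'say', 'said']
```

Note that the pattern only knows about the straight apostrophe ('). If a transcript uses typographic apostrophes, contractions would be split in two, so it may be worth normalizing them first.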
Next, we form the unigrams, bigrams, and trigrams. Here, I chose to exclude bigrams and trigrams that consist of repeated words like ‘ha ha ha’ or ‘go go go‘.
# Create unigrams, bigrams, and trigrams from the filtered tokens
unigrams = filtered_tokens
bigrams = [' '.join(bigram) for bigram in zip(filtered_tokens, filtered_tokens[1:])]
trigrams = [' '.join(trigram) for trigram in zip(filtered_tokens, filtered_tokens[1:], filtered_tokens[2:])]
# Function to filter out n-grams with repeated words
def filter_repeated_ngrams(ngrams):
    # Match a word that is immediately followed by one or more copies of itself
    pattern = re.compile(r'\b(\w+)\b(?: \1\b)+')
    return [ngram for ngram in ngrams if not pattern.search(ngram)]
# Filter out n-grams with repeated words
bigrams = filter_repeated_ngrams(bigrams)
trigrams = filter_repeated_ngrams(trigrams)
# Count occurrences
unigram_counts = Counter(unigrams)
bigram_counts = Counter(bigrams)
trigram_counts = Counter(trigrams)
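The repeated-word filter relies on a regex backreference, which is easy to get wrong, so here is a small sketch of how such a pattern behaves on a few sample n-grams. The helper name has_repeated_word is made up for this example.

```python
import re

# A word, followed one or more times by a space and the very same word (\1)
REPEATED = re.compile(r"\b(\w+)\b(?: \1\b)+")

def has_repeated_word(ngram):
    """True if the n-gram contains the same word twice in a row."""
    return REPEATED.search(ngram) is not None

for ngram in ["ha ha ha", "go go", "oh god oh", "dunder mifflin"]:
    print(ngram, "->", "dropped" if has_repeated_word(ngram) else "kept")
```

Only immediately adjacent repetitions are caught: ‘oh god oh’ is kept because the two occurrences of ‘oh’ are separated by another word.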
Finally, I plotted the results using Plotly:
# Get the 10 most common unigrams, bigrams, and trigrams
most_common_unigrams = unigram_counts.most_common(10)
most_common_bigrams = bigram_counts.most_common(10)
most_common_trigrams = trigram_counts.most_common(10)
# Convert the most common unigrams, bigrams, and trigrams to DataFrames
df_unigrams = pd.DataFrame(most_common_unigrams, columns=['Unigram', 'Count'])
df_bigrams = pd.DataFrame(most_common_bigrams, columns=['Bigram', 'Count'])
df_trigrams = pd.DataFrame(most_common_trigrams, columns=['Trigram', 'Count'])
# Plot unigrams
fig_unigrams = px.bar(df_unigrams, x='Unigram', y='Count', title='Michael Top 10 Unigrams')
fig_unigrams.update_traces(marker_color='darkgreen')
fig_unigrams.show()
# Plot bigrams
fig_bigrams = px.bar(df_bigrams, x='Bigram', y='Count', title='Michael Top 10 Bigrams')
fig_bigrams.update_traces(marker_color='darkgreen')
fig_bigrams.show()
# Plot trigrams
fig_trigrams = px.bar(df_trigrams, x='Trigram', y='Count', title='Michael Top 10 Trigrams')
fig_trigrams.update_traces(marker_color='darkgreen')
fig_trigrams.show()



Next, I tried to infer some insights from the n-gram analysis:
- The top bigram ‘oh god’ and the top trigram ‘oh god oh’ suggest that Michael experiences frequent moments of despair, disappointment, or rage.
- The frequent bigram ‘michael scott’ indicates that Michael often refers to himself in the third person, which would certainly be a major red flag for a real-life manager.
- The top unigrams ‘know’ and ‘i’m’ suggest that Michael often tries to assert his knowledge or express his identity. Phrases like ‘you know’ or ‘I know’ might be common in his dialogues, as they also appear in the most common bigrams and trigrams.
- Bigrams like ‘i’m going’ and ‘would like’ show Michael’s attempts to take action and do things.
- The bigram ‘i’m sorry’ and the trigram ‘sorry i’m sorry’ indicate Michael’s frequent need to apologize or correct himself.
- The bigram ‘dunder mifflin’ indicates Michael’s frequent references to the company.
- Other common unigrams like ‘oh’, ‘okay’, ‘like’, and ‘well’ suggest a conversational style filled with filler words. This may indicate that Michael is not very formal in his communication style and is accessible as a boss.
Sentiment Analysis: Does Michael have a good vibe? 🙂🙁
Moving on to the sentiment analysis, the Hugging Face Transformers library provides a convenient model, namely ‘distilbert/distilbert-base-uncased-finetuned-sst-2-english’, which labels text as positive or negative. This model, accessible through the pipeline API, simplifies the process of applying advanced machine learning techniques to our text data.
We can easily perform the sentiment analysis using the Transformers library and take a glimpse at the results by:
sentiment_analyzer = pipeline('sentiment-analysis')
# 'distilbert/distilbert-base-uncased-finetuned-sst-2-english'
# is the default model of pipeline('sentiment-analysis'),
# so we do not need to specify it explicitly
df['transformer_sentiment'] = df['text'].apply(lambda x: sentiment_analyzer(x)[0])
df['transformer_sentiment_label'] = df['transformer_sentiment'].apply(lambda x: x['label'])
df['transformer_sentiment_score'] = df['transformer_sentiment'].apply(lambda x: x['score'])
df[['character', 'text', 'transformer_sentiment']].head(5)

For each piece of text we provide, the model returns a label indicating whether the sentiment of the text is negative or positive, and a score denoting the confidence of the model in the label. For instance, in the first row of the data, the model identifies the text ‘All right Jim. Your quarterlies look very good. How are things at the library?’ as positive, with a score of 0.999694, suggesting that the model is very confident in this positive label. After evaluating the sentiment of each line, I aggregated the results for the top 10 characters.
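Calling the analyzer once per row can be slow on a dataset of this size. As far as I know, Transformers pipelines also accept a list of strings and return one result per string, so one option is to feed the lines in chunks. Below is a minimal sketch of that idea; label_in_batches, signed_score, and fake_analyzer are made-up names, and the fake analyzer only stands in for the real pipeline so that the sketch runs without downloading a model.

```python
def signed_score(result):
    """Turn {'label': ..., 'score': p} into +p for POSITIVE and -p otherwise."""
    sign = 1 if result["label"] == "POSITIVE" else -1
    return sign * result["score"]

def label_in_batches(texts, analyzer, batch_size=64):
    """Call the analyzer on chunks of texts and collect one result per text."""
    results = []
    for start in range(0, len(texts), batch_size):
        results.extend(analyzer(texts[start:start + batch_size]))
    return results

# Stand-in for the real pipeline, for illustration only
def fake_analyzer(batch):
    return [{"label": "POSITIVE" if "good" in text else "NEGATIVE", "score": 0.9}
            for text in batch]

lines = ["Your quarterlies look very good.", "Oh god.", "That is good news."]
results = label_in_batches(lines, fake_analyzer, batch_size=2)
print([signed_score(r) for r in results])  # [0.9, -0.9, 0.9]
```

With the real pipeline, the assumption is that sentiment_analyzer would be passed in place of fake_analyzer. The signed score is just one convenient way to fold the label and the confidence into a single number.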
# Filter top 10 characters
character_sentence_counts = df['character'].value_counts().head(10)
top_characters = character_sentence_counts.index.tolist()
df_top_characters = df[df['character'].isin(top_characters)]
# Calculate sentiment shares
sentiment_counts = df_top_characters.groupby(['character', 'transformer_sentiment_label']).size().unstack(fill_value=0)
# Calculate total sentences for each character
sentiment_counts['Total'] = sentiment_counts.sum(axis=1)
# Calculate share of positive and negative sentiments
sentiment_counts['Positive'] = 100 * sentiment_counts['POSITIVE'] / sentiment_counts['Total']
sentiment_counts['Negative'] = 100 * sentiment_counts['NEGATIVE'] / sentiment_counts['Total']
Finally, we can plot the results on a horizontal stacked bar chart using Plotly:
sentiment_counts = sentiment_counts.reset_index()
sentiment_counts = sentiment_counts.sort_values(by='Positive', ascending=False)
# Plot horizontal stacked bar chart
color_map = {'Positive': '#228B22', 'Negative': '#FF6347'}
fig = px.bar(sentiment_counts,
             x=['Positive', 'Negative'],
             y='character',
             orientation='h',
             title='Share of Lines with Positive and Negative Sentiment by Character (Top 10 Characters)',
             labels={'value': 'Share (%)', 'character': 'Character'},
             height=600,
             width=1200,
             color_discrete_map=color_map,
             )
fig.update_traces(hovertemplate='%{x:.2f}%')
fig.update_layout(barmode='stack', xaxis={'categoryorder': 'total descending'}, showlegend=True)
fig.show()

Michael’s sentiment score is 51.80% positive. From this, we may assume that Michael conveys a relatively balanced sentiment, with a slight tilt towards positivity. Unsurprisingly, the character with the highest positive score is Erin, while the character with the lowest positive score is Angela. Overall, among the top 10 characters, Michael is in the middle (6th, to be precise), which places him close to the overall mood of the group.
Emotion Analysis: So, what’s Michael’s vibe? 🥰🤔🤢😊😭 🤬
Next, I carried out an emotion analysis using the pipeline API of the Transformers library. This time, I utilized the ‘monologg/bert-base-cased-goemotions-original’ model, which allows us to classify a piece of text according to 27 emotion categories along with neutral. These categories are: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, and neutral.
So, to do this, I first load the pre-trained model:
model_name = "monologg/bert-base-cased-goemotions-original"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
emotion_classifier = pipeline('text-classification', model=model, tokenizer=tokenizer, return_all_scores=True)
Then, we can perform the emotion analysis by:
# Emotion Analysis
def classify_emotions(text):
    # Map each emotion label to its score for this line
    emotions = emotion_classifier(text)[0]
    return {emotion['label']: emotion['score'] for emotion in emotions}
df['emotions'] = df['text'].apply(classify_emotions)
# Use the first line's output to get the list of labels, then add one column per emotion
emotion_labels = emotion_classifier(df['text'].iloc[0])[0]
for emotion in emotion_labels:
    df[emotion['label']] = df['emotions'].apply(lambda x: x.get(emotion['label'], 0))
The classifier returns an array of labels and scores, similar to those for positive and negative sentiment, but now encompassing all 28 labels. We can then aggregate the scores of the 28 emotions for the top 10 characters. I decided to exclude the neutral emotion from the analysis to focus more on distinct emotions.
# Count sentences by character
character_sentence_counts = df['character'].value_counts().head(10)
# Filter top 10 characters
top_characters = character_sentence_counts.index.tolist()
df_top_characters = df[df['character'].isin(top_characters)]
# Calculate total scores for each emotion per character
emotion_totals = df_top_characters.groupby('character')[emotion_labels[0]['label']].sum().reset_index()
for emotion in emotion_labels[1:]:
    emotion_totals = emotion_totals.merge(df_top_characters.groupby('character')[emotion['label']].sum().reset_index(), on='character')
# Convert totals to shares (percentages)
emotion_totals = emotion_totals.drop(columns=['neutral'])
emotion_totals.set_index('character', inplace=True)
emotion_totals = emotion_totals.div(emotion_totals.sum(axis=1), axis=0).multiply(100)
emotion_totals = emotion_totals.reset_index()
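To spell out what a ‘share’ means here, the same arithmetic can be sketched in plain Python for a single character: add up the scores of each emotion across all lines, drop neutral, and divide by the remaining total. The function name and the two sample rows below are made up for illustration and are not real model output.

```python
def emotion_shares(score_rows, exclude=("neutral",)):
    """Sum per-emotion scores over all lines, then convert to percentages."""
    totals = {}
    for row in score_rows:
        for emotion, score in row.items():
            if emotion not in exclude:
                totals[emotion] = totals.get(emotion, 0.0) + score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {emotion: 100 * value / grand_total for emotion, value in totals.items()}

# Two made-up lines with made-up scores
rows = [
    {"curiosity": 0.6, "approval": 0.1, "neutral": 0.3},
    {"curiosity": 0.2, "approval": 0.3, "neutral": 0.5},
]
shares = emotion_shares(rows)
print({emotion: round(share, 1) for emotion, share in shares.items()})
```

Because neutral is removed before normalizing, the remaining shares always add up to 100%, no matter how neutral most lines are.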
Then, we can create a heatmap for the emotions of the top 10 characters:
melted_emotion_totals = emotion_totals.melt(id_vars=['character'], var_name='emotion', value_name='share')
# Plot heatmap
fig = px.density_heatmap(melted_emotion_totals,
                         x='emotion',
                         y='character',
                         z='share',
                         title='Heatmap of Emotions by Character (Top 10 Characters)',
                         labels={'share': 'Share (%)', 'character': 'Character', 'emotion': 'Emotion'},
                         height=600,
                         width=1000,
                         color_continuous_scale='dense'
                         )
fig.show()

At a glance, we can see Angela having high percentages of disapproval and disgust, which comes as no surprise given her character’s frequent displays of criticism and disdain towards her colleagues. Oscar’s high level of curiosity may suggest his frequent bewilderment at the absurdities of office life.
Finally, I selected Michael’s row and plotted his scores for each emotion using Plotly.
michael_emotion_totals = emotion_totals[emotion_totals['character'] == 'Michael'].reset_index(drop=True)
michael_emotion_totals = michael_emotion_totals.T
michael_emotion_totals.columns = michael_emotion_totals.iloc[0]
michael_emotion_totals = michael_emotion_totals.drop('character')
michael_emotions_df = michael_emotion_totals.reset_index()
michael_emotions_df.columns = ['emotion', 'share']
# Sort from highest to lowest
michael_emotions_df = michael_emotions_df.sort_values(by='share', ascending=True)
# Plot bar chart
fig = px.bar(michael_emotions_df,
             x='share',
             y='emotion',
             orientation='h',
             title='Distribution of Emotions for Michael',
             labels={'share': 'Share (%)', 'emotion': 'Emotion'},
             text='share',
             height=600,
             width=1000)
fig.update_traces(texttemplate='%{text:.2f}', textposition='outside')
fig.update_layout(xaxis={'categoryorder': 'total descending'}, showlegend=False)
fig.show()

Lastly, here’s an attempt to interpret the emotion analysis:
- The most prominent emotion expressed by Michael is ‘curiosity‘, comprising 19.08% of his emotional expressions. This suggests that Michael often asks questions and seeks information in his interactions.
- ‘approval‘ (14.70%) and ‘admiration‘ (11.24%) are the next most common emotions. Michael frequently seeks validation or gives approval and expresses respect or positive regard towards others.
After all, was Michael Scott world’s best boss?
Probably not 🤷♀. Michael talks more than anybody else throughout the show, frequently refers to himself in the third person (!), and is super dramatic, regularly exclaiming ‘oh god!’. Nonetheless, the analysis also highlights several good qualities of Michael, such as admitting to his mistakes and apologizing, having an overall reasonably positive attitude, being curious, valuing the approval and opinion of others, and expressing admiration. Is he the world’s BEST boss? Probably not, but he still is pretty good.
On another note, as expected, the text analysis isn’t able to grasp all the complex dynamics of the show, such as practical jokes or the facial expressions that compose a large part of the characters and the humor of the series. Additionally, text analysis sometimes struggles to interpret extreme sarcasm or punchlines specific to the show. For instance, ‘That’s what she said!’ – a very frequent punchline of Michael – is labeled as neutral, whereas in real life it is borderline offensive.
Having said that, despite its shortcomings, text analysis is undeniably an extremely useful tool. It allows us to infer quick insights from vast amounts of data, as well as interpret data objectively without our personal judgments and biases being involved in the analysis. Advanced machine learning models, like those available through Hugging Face Transformers, enhance these analyses, making it possible to handle complex language patterns and provide meaningful insights.
References
SchrutePy – The Entire Transcript from The Office in Tidy Format. Maintained by Brad Lindblad, released under the MIT License, available at PyPI and GitHub.
✨Thank you for reading!✨