Sentiment Analysis Of Political Speeches Using Hugging Face’s Pipeline Feature

Who would you rather trust to assess political speeches — human pundits, or an algorithm? A test run on Covid-19 speeches in Singapore between February and June 2020 turns up some interesting results.

Chua Chin Hon
Towards Data Science


Election season fills many of us with dread. But it is also a good opportunity to test out some of the new NLP features out there with a practical use case.

In recent weeks, I’ve been experimenting with Hugging Face’s (HF) pipeline feature to see how easily and quickly it can be used to analyze and summarize political speeches in Singapore. This post outlines my attempts to use it for sentiment analysis of individual speeches as well as groups of speeches over time, all delivered between February and June 2020.

I find the results pretty impressive, despite just using the default model without additional fine-tuning on local data. With elections coming up in countries like the US and Singapore, there is further scope for testing how well these transformer-based models hold up in different political contexts.

REPO, DATA AND CAVEATS

The GitHub repo for this post contains a notebook and the data needed to generate some of the charts here, as well as a sample Plotly chart and a CSV table of the results. The code can be easily tweaked if you wish to generate results for multiple speeches in one go.

The data comprises six official speech transcripts taken from the websites of the Singapore Government as well as the Prime Minister’s Office. These speeches focused on the Government’s plans to deal with the challenges from Covid-19, and are set to frame the broader debate for Singapore’s upcoming election. Some excessively long chunks of text were broken up into smaller paragraphs for a fairer assessment of the sentiment, but the vast majority of the speeches were analysed in their original form.

A more rigorous application of sentiment analysis would require fine-tuning the model with domain-specific data, especially if specialized topics such as medical or legal issues are involved. In this post, however, I’m using the HF pipeline for sentiment analysis “out of the box”, meaning the results are based on the default Distilbert model (more specifically: distilbert-base-uncased-finetuned-sst-2-english). HF’s website has a growing list of other models compatible with sentiment analysis and text classification tasks if you wish to experiment further.

ANALYZING THE SENTIMENT OF 1 SPEECH

Thanks to the good folks at HF, the pipeline API takes care of the heavy lifting behind the scenes. Executing the sentiment analysis takes just a few lines of code, after simple text pre-processing (splitting into paragraphs) and cleaning.
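A minimal sketch of that workflow is below; the file name and the paragraph-splitting logic are placeholders for illustration, not the exact code in the repo’s notebook:

```python
from transformers import pipeline

# Hypothetical file name, for illustration; the repo's notebook loads the
# actual speech transcripts.
with open("speech_2020_06_07.txt", encoding="utf-8") as f:
    speech = f.read()

# Simple pre-processing: split the transcript into paragraphs and drop blanks.
paragraphs = [p.strip() for p in speech.split("\n\n") if p.strip()]

# The default model behind this task is
# distilbert-base-uncased-finetuned-sst-2-english; naming it explicitly
# just makes that choice visible.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# The pipeline accepts a list of strings and returns one {'label', 'score'}
# dict per paragraph.
results = classifier(paragraphs)
print(results[0])
```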

The pipeline generates a sentiment label as well as a score. By extracting the labels and scores into a separate data frame, we can easily inspect the results and see where the model might have made mistakes:
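One way to do this, continuing from the sketch above (the column names here are my own choices, not necessarily those in the repo):

```python
import pandas as pd

# Continuing from the sketch above: `results` is the list of {'label', 'score'}
# dicts returned by the pipeline, and `paragraphs` the matching text.
df = pd.DataFrame(results)
df["text"] = paragraphs

# Quick sanity checks: overall label counts, plus the paragraphs the model
# labelled NEGATIVE, which are often where mistakes hide.
print(df["label"].value_counts())
print(df[df["label"] == "NEGATIVE"][["score", "text"]].head())
```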

The trickier task, for me, is finding a good way to visualize and annotate the results. I eventually settled on a combination of Plotly and Google Slides for clarity and ease of use. Feel free to switch to a visualization library of your choice.
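For what it’s worth, here is one rough way to plot the “sentiment structure” of a speech with Plotly Express, building on the data frame above; the charts in this post were styled separately in Plotly and Google Slides, so treat this only as a starting point:

```python
import plotly.express as px

# Building on the data frame above: flip negative scores below zero so
# positive and negative paragraphs sit on opposite sides of the axis.
df["paragraph"] = range(1, len(df) + 1)
df["signed_score"] = df.apply(
    lambda row: row["score"] if row["label"] == "POSITIVE" else -row["score"],
    axis=1,
)

fig = px.bar(
    df,
    x="paragraph",
    y="signed_score",
    color="label",
    title="Sentiment structure, paragraph by paragraph",
)
fig.show()
```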

Here are the sample results from the sentiment analysis of the first speech in the dataset:

HF’s sentiment analysis pipeline assessed 23 of this speech’s 33 paragraphs to be positive.

At a glance, you can tell where, and for how long, a speaker dwelled in positive or negative territory. In this case, we can see how Singapore Prime Minister Lee Hsien Loong balanced the grim warnings of Covid-19’s economic impact with assurances of the Government’s plans to help Singaporeans cope with job losses. He also ended his June 7 speech on a positive note by rallying the country around a common challenge.

ANALYZING THE SENTIMENT OF A GROUP OF SPEECHES

If we group a series of these “sentiment structure” charts together, we can quickly get a sense of whether speakers from the same outfit are delivering a consistent message, tone-wise, or if each person is singing a different tune. You can also use this to observe how sentiment on a subject changed over time.
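A rough sketch of how such a grouped chart could be put together, assuming the transcripts sit in local text files (the file names and facet layout here are hypothetical, not the repo’s exact code):

```python
import pandas as pd
import plotly.express as px
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

# Hypothetical mapping of speech labels to transcript files, for illustration only.
speeches = {
    "PM Lee, Jun 7": "speech_lee_jun_07.txt",
    "SM Teo, Jun 11": "speech_teo_jun_11.txt",
}

rows = []
for speech_label, path in speeches.items():
    with open(path, encoding="utf-8") as f:
        paragraphs = [p.strip() for p in f.read().split("\n\n") if p.strip()]
    for i, res in enumerate(classifier(paragraphs), start=1):
        signed = res["score"] if res["label"] == "POSITIVE" else -res["score"]
        rows.append({"speech": speech_label, "paragraph": i,
                     "signed_score": signed, "label": res["label"]})

df = pd.DataFrame(rows)

# One small sentiment-structure chart per speech, laid out side by side.
fig = px.bar(df, x="paragraph", y="signed_score", color="label",
             facet_col="speech", facet_col_wrap=2)
fig.show()
```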

The chart below features an example using 10 speeches to show how political leaders in Singapore changed the tone of their speeches on Covid-19 between February and June, ahead of expected polls. If you compare the left and right columns, you’ll notice a distinct shift in overall sentiment (on Covid-19) from being predominantly negative in tone between February and late April, to a more positive note in June:

The results are not surprising as the outbreak significantly worsened in Singapore in April, before the Government successfully contained it by May. With an election around the corner, the top political leaders clearly needed to switch to a new communication strategy — and tone — which they did in June.

The speeches from February and late April are not included in the repo, but are available via the links above to the two government websites.

I believe this approach can be applied to analyzing, say, Trump’s change in tone between the initial and later phases of the Covid-19 outbreak. I’d also be keen to test it out during the Biden-Trump debates.

WHERE THE ALGORITHM TRIPS UP

Results from the speeches analysed in this post can be downloaded here for a quick view. See if you agree with the algorithm’s assessment.

There are mistakes, of course. The chart below shows at least two places where the Distilbert model got it wrong when analysing a June 9 speech on Covid-19 by Singapore’s National Development Minister Lawrence Wong:

The algorithm wrongly labelled a portion of his speech discussing Singapore’s new testing and contact tracing capabilities as negative, when it should be considered neutral or positive. It tripped up again towards the end of the speech, when it wrongly labelled a paragraph praising the contributions from the business community as negative.

Elsewhere, the algorithm can be “defeated” by vague and generic descriptions of a dire situation, such as the second paragraph from a June 11 speech by Singapore’s Senior Minister Teo Chee Hean:

The language in the paragraph above greatly understates the extent and severity of the disruption caused by Covid-19, and the algorithm interestingly labelled it as positive. A human analyst would have arrived at the opposite conclusion.

END NOTE + FURTHER WORK

An obvious area of improvement would be fine-tuning the model with a custom dataset on Singapore politics and/or Covid-19. The task, however, doesn’t look trivial (to me, anyway), and finding a relevant, labeled training dataset is not easy.
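For the curious, a fine-tuning run with HF’s Trainer API might look roughly like the sketch below; the CSV of locally labelled paragraphs is entirely hypothetical, which is precisely the hard part:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical CSV of locally labelled paragraphs with "text" and "label"
# columns (label encoded as integers, 0 = negative, 1 = positive);
# no such dataset exists in the repo.
dataset = load_dataset("csv", data_files={"train": "sg_covid_sentiment.csv"})

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilbert-sg-covid", num_train_epochs=3),
    train_dataset=dataset["train"].map(tokenize, batched=True),
)
trainer.train()
```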

I would also be interested to see how results from HF’s pipeline stack up against other known approaches in sentiment analysis.

But overall, I’ve been pretty impressed by the results from the HF pipeline, despite the obvious limits of a text-only approach: the sentiment of a speech clearly doesn’t depend on the words alone, but also on the speaker’s delivery and demeanor.

A full AI-based sentiment analysis of speeches should ideally incorporate the use of computer vision and audio forensics to better understand how a speaker’s facial expressions and speaking voice add to the overall sentiment of the speech. But this is beyond the scope of this post.

Meanwhile, let me know if you spot errors or do something interesting with this approach. Ping me at:

Twitter: Chua Chin Hon

LinkedIn: www.linkedin.com/in/chuachinhon
