US Presidential Voices Over the Ages

Analyzing the message, sentiment, & sophistication of words in presidential history using Natural Language Processing & Topic Modeling.

Celina Plaza
Towards Data Science

US presidential speech podium
Official White House photo by Pete Souza

“Speech is power: speech is to persuade, to convert, to compel.” — Ralph Waldo Emerson

Since the dawn of the United States of America, US presidential speeches have served as both a reflection of the current state of the nation and a call for change in the direction the President believes the country should go.

Presidential speeches offer insight into what the nation’s leaders are thinking and hoping for the country, and how they intend to use their power to move it in that direction. How they deliver that information affects how the public receives the message and whether action follows.

Using data science techniques and tools for Natural Language Processing and Unsupervised Learning, I set out to better understand how Presidents use the power of their speeches by examining the sentiment, sophistication of speech, and focus of content of more than 990 presidential speeches. I then looked for trends, patterns, and other insights over time and by political party.

From George Washington’s 1789 First Inaugural Address to Jimmy Carter’s 1977 Address to the Nation on Energy to Donald Trump’s 2019 State of the Union, every president to date (as of 2020) is represented in the speeches analyzed in this project*.

With that, here are my findings…

Sentiment of US Presidents

Tools used: TextBlob’s polarity and subjectivity

Speech sentiment was measured in two ways: polarity (a scale from more negative/sad in tone to more positive/happy in tone) and subjectivity (a scale from more fact-based to more opinion-based).
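For readers who want to try this themselves, here is a minimal sketch of how the two scores can be computed with TextBlob and then averaged so that each President becomes a single point. It is not the exact code behind this project: the function names and the input shape (a list of president/score tuples) are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def score_speech(text):
    """Return (polarity, subjectivity) for one transcript using TextBlob."""
    # Imported lazily so the averaging helper below works without TextBlob installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment
    # polarity ranges from -1.0 to 1.0; subjectivity ranges from 0.0 to 1.0
    return sentiment.polarity, sentiment.subjectivity


def average_by_president(scored_speeches):
    """Average per-speech scores so each President becomes one point.

    scored_speeches: iterable of (president, polarity, subjectivity) tuples
    (a hypothetical input shape chosen for this sketch).
    """
    grouped = defaultdict(list)
    for president, polarity, subjectivity in scored_speeches:
        grouped[president].append((polarity, subjectivity))
    return {
        president: (mean(p for p, _ in scores), mean(s for _, s in scores))
        for president, scores in grouped.items()
    }
```

Each speech would first be passed through `score_speech`, and the resulting tuples handed to `average_by_president` to get one polarity/subjectivity pair per President.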

Below are the sentiment analysis results, where each dot represents a different President and each color represents a political party >

Screenshot by Author

As you can see in the graph, there are no strong clusters of colors, meaning there doesn’t seem to be a clear distinction of sentiment of Presidents by party — it seems to be more of a shift by individual.

Here you can see names of Republican Presidents >

Screenshot by Author

And here you can see names of Democratic Presidents >

Screenshot by Author

From these graphs we can see, for example, that compared to other US Presidents, Franklin Pierce was on average more negative/sad in tone and more factual in content, while Donald Trump is on average more positive/happy in tone and more opinionated in content.

Keep in mind that polarity and subjectivity do not tell us what a President is saying; a President could say something in a positive tone that still has negative impacts on people. This sentiment analysis only tells us how a President is delivering the content and whether they’re using facts or opinions to back their statements.

Sophistication of Speech of US Presidents

Tools used: textstat’s grade level analysis

‘Sophistication of speech’ is a measurement based on the school grade level a person would need to have reached in order to read a text, adapted here to mean the grade level required to fully understand a speech when hearing it.
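To make the idea of a grade-level score concrete, here is a rough sketch of the Flesch-Kincaid grade formula, one of several readability formulas that textstat offers (as `textstat.flesch_kincaid_grade`). Which textstat measure was used for this analysis is not stated here, so treat the choice of formula as an assumption. The syllable counter below is a deliberately naive stand-in, so its numbers will not match textstat’s exactly.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of consecutive vowels (minimum 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Approximate US school grade level needed to read `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    # Longer sentences and longer words both push the grade level up.
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59
```

Short sentences built from short words score low, while long sentences packed with multi-syllable words score high, which is the pattern the grade-level graph tracks over time.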

The graph below shows the grade level required to understand each speech over the years. Colors again represent political parties >

Screenshot by Author

From the graph we can see that in the early years of the United States, Presidents had a higher sophistication of speech, on average requiring a college-level education or higher to read and understand it. Then, around the 1920s, a downward trend in sophistication of speech begins, and it has continued through today. One potential reason could be each President’s own vocabulary, or it could be a more calculated decision: the 1920s was a period when radios became more common in households. As presidential speeches could be heard by more people, did Presidents change their vocabulary to relate to and connect more widely with the public hearing them? This analysis cannot tell us definitively, but it’s an interesting area for exploration.

Also note that political parties are mixed in all grade levels, indicating that there is no clear difference in terms of sophistication of speech by political parties.

Topics of US Presidential Speeches

Tools used: Unsupervised topic modeling with gensim’s LDA model

Now let’s try to get a better sense of the content of the Presidential speeches.

Using Latent Dirichlet Allocation (LDA) topic modeling, seven topics were identified for Presidential speeches:

  1. American jobs and family help & needs
  2. Law, constitution, and rights
  3. Laws, treaties, and action
  4. Public power & duty
  5. War with American freedom
  6. Work & business
  7. World peace with war & force

What’s important to note about these topics is the positioning and balance of words. In “War with American freedom” and “World peace with war & force,” a goal is paired with other objectives: going to war... to have American freedom, or finding world peace... but with war and force as well.
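For anyone curious how topics like the seven above can be extracted, here is a minimal sketch using gensim’s `Dictionary` and `LdaModel`. It is an illustration under assumptions, not the exact pipeline used for this project: the tiny stop-word list, the `tokenize` and `fit_lda` helper names, and the `passes` and `seed` values are all made up for the example. Note that LDA only returns groups of weighted words; readable topic names are typically assigned by a person after inspecting the top words in each group.

```python
import re

# A tiny illustrative stop-word list; a real run would use a much fuller one.
STOP_WORDS = {"the", "and", "of", "to", "a", "in", "that", "is", "for", "our", "we"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words and short tokens."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOP_WORDS and len(t) > 2]


def fit_lda(transcripts, num_topics=7, passes=10, seed=0):
    """Fit an LDA model and return (model, bag-of-words corpus)."""
    # Imported lazily so tokenize() can be tried without gensim installed.
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    tokenized = [tokenize(t) for t in transcripts]
    dictionary = Dictionary(tokenized)  # maps each unique token to an integer id
    corpus = [dictionary.doc2bow(tokens) for tokens in tokenized]
    model = LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics,
                     passes=passes, random_state=seed)
    return model, corpus
```

After fitting, `model.print_topics()` lists the highest-weighted words per topic, which is the raw material for choosing labels.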

Here is how the seven topics have trended in presidential speeches over the years >

Screenshot by Author

Again, we don’t see much distinction between parties in the topics of speeches; instead, we see more difference over time. In the early years, as the United States was developing, ‘Laws, treaties, and action’ and ‘Public power & duty’ were more common. In more recent years, ‘American jobs and family help & needs,’ ‘World peace with war & force,’ and ‘War with American freedom’ have been more popular with Presidents.

Let’s examine these topics another way, by looking at when each topic appeared across different periods of US history. Each color represents a speech topic >

Screenshot by Author

From this graph we can see that certain historical periods seem to mark the end or the advent of topics in presidential speeches. For example, during the era of the New Deal we first saw “American jobs & family” rise to be a primary topic of a presidential speech, and we have continued to see it show up in nearly every era since.
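One plausible way to build a view like this is to take each speech’s highest-weighted topic as its primary topic and count those per historical period. The sketch below assumes that approach, which may differ from how the graph above was actually produced. The input uses the list-of-(topic id, weight) shape that gensim’s `LdaModel.get_document_topics` returns for one document; the function names are hypothetical.

```python
from collections import Counter, defaultdict


def dominant_topic(topic_weights):
    """Return the topic id with the largest weight.

    topic_weights: list of (topic_id, weight) pairs for one speech.
    """
    return max(topic_weights, key=lambda pair: pair[1])[0]


def primary_topics_by_period(speeches):
    """Count how often each topic is the primary topic within each period.

    speeches: iterable of (period, topic_weights) tuples
    (a hypothetical input shape chosen for this sketch).
    """
    counts = defaultdict(Counter)
    for period, topic_weights in speeches:
        counts[period][dominant_topic(topic_weights)] += 1
    return counts
```

A topic that first shows a nonzero count in one period and keeps appearing in later periods is what the graph would show as the advent of that topic.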

Summary of Findings

In summary, here’s what the analysis of this project told us:

  • Sentiment seems to vary by President, not necessarily by party.
  • ‘Sophistication’ of words in speeches has trended down over the years.
  • Topics of speeches vary more over time than by political party.
  • Topics of speeches seem to have a relationship with the state of the nation and the concerns of the public.
  • Historical periods may serve as a catalyst for a shift in speech topics for generations.

Finally, I want to emphasize again that this analysis showed some trends and patterns but, equally important, it showed that on the surface there can be similarities in the tone and positioning of speeches across Presidents and political parties. With that in mind, we all need to pay attention to the deeper substance of presidential speeches in order to vote, and to support Presidents, in an informed way. That may not be a revelation, but it is a critical reminder.

Thank you for reading; I hope you found this analysis and its insights interesting. Please contact me via LinkedIn if you have any comments or questions, or just want to connect.

A few notes about this work:

  • Keep in mind that parties have evolved and changed over time, just as people do. For example, a Democrat of today does not necessarily have the same agenda or value set as one in 1850. Analysis at the political-party level should be read with that caveat in mind. Any insights provided about political parties are generalizations, not hard conclusions.
  • I am currently a data science student at Metis and learning more each day. This work was an application of my most recent education in the data science field, but I am still growing in this field so I anticipate wanting to revisit and update my findings as my skills advance. I welcome any advice on future iterations.
  • All of my graphs were made in Tableau and are interactive with filtering options based on interests. Graphs shown in this blog are screenshots of those Tableau graphs.

*A huge thanks to Joseph Lilleberg, who collected all the transcripts of presidential speeches from The Miller Center at the University of Virginia’s website and offered them on Kaggle.com in an easy-to-use CSV for public use. Thanks also to The Miller Center for making the original transcripts publicly accessible.
