In Their Own Words: 60 Years of Presidential Debates

An NLP analysis of presidential debates from Kennedy-Nixon to Biden-Trump

Rishi Patel, PhD
Towards Data Science

--


While many of us (hopefully) have already cast our votes by mail or through early voting, there may still be a few who have not made up their minds by November 3. In the spirit of democracy, I hope this article reaches you before you are at the polls. Inspired by the impending election day, I decided to flip through some of the old presidential debates. Nostalgia might be a key theme for 2020, as people start to realize all the things that went unappreciated in the pre-pandemic world. Election day is an essential crossroads, and before choosing a path, it might be useful to look back and see where we have been in our evolving democracy.

This project was initially inspired by an NLP post on visualizations with Scattertext. The post analyzed presidential convention speeches from 2012 (Obama v. Romney) and categorized them by political party. The resulting plot is a great snapshot of the partisan divide illustrated in words. Word frequency and rank also give a snapshot of key topics in each party platform (above and below the diagonal) and of common issues along the diagonal. Convention speeches are rich with key party platform phrases, e.g.:

  • Republican: Unemployment, Business, Liberty
  • Democratic: Women, Middle-class, Medicare

Below, I use Scattertext to explore topic features in the presidential debates from 1960–2020. The debates are a fascinating case for many reasons. The moderator or panelists choose the topics, which are either current issues or more free-form topics about leadership. Candidates are not in the comfort of their own supporters, but are often faced with tough questions from undecided voters. Also, time constraints require them to be brief and to the point: candidates have to measure the words they choose to address the topic. Through each candidate, one can see a snapshot of political party ideology in an election cycle.

Below is a screenshot of a Scattertext plot that uses PyTextRank to score prominent phrases. These phrases are then associated with a point by their dense rank frequency and the difference in dense rank frequency between the two party categories (great example here!). In the caption, there is a link to the interactive plot where you can click on a point and see who said what!

Scattertext plot snapshot showing the ranked frequency along the axes for Democrats (vertical) and Republicans (horizontal). Frequent phrases used by Democrats are closer to the top left corner, and frequent Republican phrases are in the bottom right. The diagonal in yellow represents more neutral words or issues shared by both parties (e.g. trade, democracy, politics, crime). The plot is interactive and lets you click on words to see the debate responses containing the word for all speakers. The interactive plot can be accessed here: (Scattertext Plot Presidential Debates with search bar for terms)
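As a rough illustration, here is a minimal sketch of how a plot like this can be produced with Scattertext and PyTextRank. The debates_df rows are placeholders standing in for the scraped transcripts, and the wiring assumes spaCy 3 with pytextrank 3+; the feature extractor and term scorer follow the Scattertext documentation examples referenced below.

import pandas as pd
import spacy
import pytextrank  # registers the "textrank" spaCy pipeline component
import scattertext as st

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("textrank")

debates_df = pd.DataFrame({  # placeholder rows; the real data is the scraped debates
    "party": ["Democratic", "Republican"],
    "text": ["We must invest in health care and public education.",
             "We must cut taxes and reduce the national debt."],
})

corpus = st.CorpusFromParsedDocuments(
    debates_df.assign(parse=lambda d: d.text.apply(nlp)),
    category_col="party",
    parsed_col="parse",
    feats_from_spacy_doc=st.PyTextRankPhrases(),
).build()

html = st.produce_scattertext_explorer(
    corpus,
    category="Democratic",
    category_name="Democratic",
    not_category_name="Republican",
    minimum_term_frequency=0,
    term_scorer=st.RankDifference(),  # difference in dense rank frequency
    width_in_pixels=1000,
)
open("debate_scattertext.html", "w").write(html)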

Snooping Around the Plot for Features

Starting from a single word, we can take a deep dive into the ideological divide using the intuitive features of Scattertext visualizations. Look along the diagonal at key phrases that score near zero: these represent key issues or challenges that both parties need to address. It is like standing in a valley and looking up at two mountains:

  • The swing state of Ohio is listed along the diagonal, while Michigan scores further toward the negative because it is mentioned more frequently by Republicans.
  • Key issues like “the budget” and “trade” are mentioned with an average frequency by both parties and so these points show up towards the center of the plot.
  • The most frequently discussed topics scored almost equally by both parties, in the upper right region of yellow points, are “taxes” and “jobs”.

The words that indicate ideology are furthest away from this valley. For example, the phrase “American workers” is exclusively used by Democratic candidates. A similar phrase “younger workers” is used mostly by George W. Bush to describe his plan for Social Security. Words that have equal and opposite sign scores are interesting for comparing the focus of party platforms. For example, NAFTA for Republicans scores as much as “incomes” for Democrats, suggesting that Republicans discuss and criticize NAFTA as much as Democratic candidates focus on household incomes. More instances give an indication of the key talking points for each party:

  • D: “human rights”, R: “our troops”
  • D: “tax cuts”, R: “unemployment”
  • D: “housing”, R: “debt”

Looking at the top left and bottom right corners, you can see what each candidate said about his or her opponent. One interesting feature is conceded agreement: for example, Barack Obama acknowledges a few times that Senator McCain (or “John”) is “absolutely right”, and John McCain acknowledges this at least once (when referring to tactics and strategies in Afghanistan). Looking at the bottom portion of the plot, the debates get more personal as the opponents are referred to mainly by their first names (bottom left), and many of the frequent terms used by Republicans and not Democrats are dominated by key phrases from Donald Trump:

  • “Yeah, yeah, we’ve heard — we’ve heard this before, Hillary.”
  • “Because you know what, there’s nothing smart about you, Joe. 47 years, you’ve done nothing.”
  • “Excuse me. Because she has been a disaster as a senator. A disaster.”
  • “She’s got bad judgment, and honestly, so bad that she should never be president of the United States. That I can tell you.”
  • “I don’t know Beau. I know Hunter. Hunter got thrown out of the military. He was thrown out, dishonorably discharged for cocaine use.”

More substantive topics are interesting to follow because candidate stances evolve over time. Abortion has been a key issue in the partisan divide between Republicans and Democrats, especially in light of the recent confirmation of Amy Coney Barrett. It was mentioned in the Carter-Ford and Mondale-Reagan debates. Reagan and Carter give diametrically opposing statements even though they were not debating each other at the time:

“I think abortion’s wrong. I don’t think the government oughta do anything to encourage abortion. But I don’t favor a constitutional amendment on the subject…I personally don’t believe that the federal government oughta finance abortions, but I — I draw the line and don’t support a constitutional amendment.” -Jimmy Carter

“But with regard to abortion, and I have a feeling that this is — there’s been some reference without naming it here in the remarks of Mr. Mondale tied to injecting religion into government. With me, abortion is not a problem of religion, it’s a problem of the Constitution.” -Ronald Reagan

Reagan states that the issue is about the Constitution and has less to do with religion. Carter’s response is a bit muddled: he states that there should not be a constitutional amendment banning abortion, but also that the federal government should not have programs to finance abortions. George W. Bush’s response implies the issue is a matter of society building, which is in tension with Reagan’s framing around the Constitution. Hillary Clinton and Michael Dukakis state clearly that it is an issue of a woman’s right to choose. Al Gore and John McCain talk about Roe v. Wade in the context of litmus tests for Supreme Court justices; both treat a nominee’s stance on Roe v. Wade as a key indicator for appointment or even nomination, though from opposite directions.

“I think it’s important to promote a culture of life. I think a hospitable society is a society where every being counts and every person matters. I believe the ideal world is one in which every child is protected in law and welcomed to life.” -George W. Bush

“And when the phrase a strict constructionist is used and when the names of Scalia and Thomas are used as the benchmarks for who would be appointed, those are code words, and nobody should mistake this, for saying the governor would appoint people who would overturn Roe v. Wade. It’s very clear to me. I would appoint people that have a philosophy that I think would have it quite likely they would uphold Roe v. Wade.” -Al Gore

“I would consider anyone in their qualifications. I do not believe that someone who has supported Roe v. Wade that would be part of those qualifications.” -John McCain

Barack Obama acknowledged that there is a divide and potentially the two sides could share common ground in preventing unintended pregnancies:

“This is an issue that — look, it divides us. And in some ways, it may be difficult to — to reconcile the two views… ‘We should try to prevent unintended pregnancies by providing appropriate education to our youth, communicating that sexuality is sacred and that they should not be engaged in cavalier activity, and providing options for adoption, and helping single mothers if they want to choose to keep the baby.’ Those are all things that we put in the Democratic platform for the first time this year, and I think that’s where we can find some common ground, ...” -Barack Obama

Investigating the presidential debates through the lens of Scattertext highlights timeless political topics that morph and evolve within party platforms over time. The above plot inspired me to look at the topic features in presidential debates, analyze how the topics of debates change over time, and rank candidates by ideology based on their responses using a BERT model trained for text classification.

Analysis Toolkit

  1. The first stage of getting the data is to scrape the webpages that contain the debate transcripts. I used The Commission on Presidential Debates, which has a very simple format. This site is missing the most recent set of debates between Donald Trump and Joe Biden, which I got from USA Today. I used Beautiful Soup 4 to get the text from these webpages (linked in the resources down below); a minimal scraping sketch follows this list. I link my code here.
  2. Non-negative matrix factorization (NMF) from scikit-learn is used to extract keywords from the moderator/panelist text. PyTextRank is used to score the most prominent topic phrases.
  3. I use spaCy to assign keywords to a given topic and match candidate responses. I identified the keywords and topics here.
  4. The BERT text classification model is trained on a Google Colab GPU with a TensorFlow classification layer; I have a notebook linked below.
  5. Visualizations and output charts are made with Tableau.
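As a rough illustration of step 1, here is a minimal scraping sketch. It assumes the transcript text sits in plain paragraph tags, which matches the simple layout of the site; the URL is a placeholder for the per-debate transcript pages linked in the resources below.

import requests
from bs4 import BeautifulSoup

# Placeholder URL: the real per-debate transcript pages are linked
# in the resources below and share the same simple paragraph layout.
url = "https://www.debates.org/voter-education/debate-transcripts/"
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# Speaker tags like "MR. NIXON:" are embedded in the paragraph text,
# so they can be split off in a later parsing pass.
paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
transcript = "\n".join(paragraphs)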

Extracting Timeless Topics

The roles of panelists and moderators make it easy to find debate prompts for each debate. The difficulty in analyzing prompts is that the terminology changes over time. For example, “climate change” is a common term now, but in the 2000 debate Al Gore referred to it as “global warming”, so I rely on picking out topic words per debate and then grouping them into thematic topics. Across the debates, there are thematic topics that are revisited across decades: taxes, the environment and climate change, the economy, federal spending, health care, gun control, immigration, national defense, oil and the oil industry, public education, race and discrimination, and social welfare.

I found these topics by building a topic model for the moderator and panelist text for each debate and grouping the keywords into themes. An example is shown below for topic words in the Obama-Romney debate:

gas prices, specific examples, financial problems, the difference, american jobs, productive members, vast array, clear choice, quick response, red lines, other things, everyday living, your energy secretary, your energy, tax revenue, the biggest, the biggest misperception, own plan, his own plan, lower gas prices
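A minimal sketch of how per-debate topic words like these can be pulled out with scikit-learn's NMF; the prompts list is a placeholder standing in for the real moderator/panelist text.

from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Placeholder prompts; the real input is the moderator/panelist text.
prompts = [
    "What will you do about rising gas prices and energy policy?",
    "Give specific examples of how you would create American jobs.",
    "How would your plan raise tax revenue without raising taxes?",
    "What is the biggest misperception about your own plan?",
    "How will everyday living change under your energy secretary?",
]

vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
tfidf = vectorizer.fit_transform(prompts)

nmf = NMF(n_components=2, random_state=0).fit(tfidf)
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(nmf.components_):
    top = [terms[j] for j in weights.argsort()[::-1][:8]]
    print(f"topic {i}: {', '.join(top)}")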

The full text is reduced using PyTextRank, keeping only phrases with a score larger than 0.1. The topic modeling is messy because the text is very sparse and the debate topics are designed to cover a lot of ground. The cohesion of the topic above is low, but the highlighted words show the broad theme of the economy and oil prices. I encode these words as topics using spaCy match patterns. For the topic “Social Welfare”, the match patterns are listed below, based on the words used by the moderators.

import spacy
from spacy.matcher import Matcher
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
# spaCy 2.x add() signature; in spaCy 3, pass a list of patterns instead
matcher.add("Social Welfare", None, [{"LOWER": "social"}, {"LOWER": "security"}])
matcher.add("Social Welfare", None, [{"LOWER": "housing"}, {"LOWER": "subsidies"}])
matcher.add("Social Welfare", None, [{"LOWER": "minimum"}, {"LOWER": "wage"}])
matcher.add("Social Welfare", None, [{"LOWER": "prevailing"}, {"LOWER": "wages"}])
matcher.add("Social Welfare", None, [{"LOWER": "abnormal"}, {"LOWER": "poverty"}])
Topics across all presidential debates split by the % of the total matched candidate responses

Given the topic keywords, I run over all the candidate responses across the debates and count how often there is a match in each topic (a minimal sketch of this matching loop follows). From the above chart, it is clear that the broad domestic topics of the economy, taxes, and federal spending make up almost 50% of the candidates’ responses over the years. These topics drive much of the debate about domestic policy. The next largest topic is health care, shown in green, which makes up 13.8%. Public education is the least discussed topic, at below one percent (0.86%).
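A minimal sketch of that loop, reusing the nlp and matcher objects defined above; the responses list is a placeholder for the scraped candidate responses.

from collections import Counter

responses = [  # placeholder pairs; the real data comes from the scraped debates
    ("Ford", "We must protect the minimum wage and Social Security."),
    ("Carter", "My opponent is wrong about a great many things."),
]

topic_counts, n_matched = Counter(), 0
for candidate, text in responses:
    matches = matcher(nlp(text))
    if matches:
        n_matched += 1
    for match_id, start, end in matches:
        topic_counts[nlp.vocab.strings[match_id]] += 1

# Fraction of responses hitting at least one topic (the match efficiency)
match_efficiency = n_matched / len(responses)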

To get a measure of the amount of substance in each debate, compared to other text like personal attacks or crosstalk, I look at how frequently the topic keywords match the candidate response text. I do not expect 100% match efficiency, because the chosen keywords do not cover all possible debate questions and prompts. However, it is useful to see how flat the match efficiency stays over time as a gauge of the quality of the debates.

Efficiency of the topic keywords at matching candidate responses, used as an indicator of how much substantive discussion there is in the debates

Early presidential debates contain more topic words than the later debates. The 1980 debate between Carter and Reagan has the largest percentage of matched responses (66%). The debates after 1980 through 2012 sit near the average value of 40%, ranging from 33% to 46%. The lowest matches are for the most recent debates: Clinton-Trump in 2016 and Trump-Biden in 2020.

Breakdown of debate topics for the 1980 Carter-Reagan debates and 2020 Biden-Trump debates

Comparing the debates with the highest and lowest match efficiency points out how broad or narrow the focus of each debate was. The 1980 debate breaks down nicely between discussions of the oil industry, the overall economy, social welfare, climate change, and race/discrimination. Health care (4%) and national defense (7.2%) are covered to a lesser extent. For the most recent debates, the economy is a larger portion of the discussion along with health care (likely highlighted by COVID-19). Race accounted for only 8% of the matched candidate responses in 1980, with the matched keywords “racial discrimination” and “black unemployment”. In 2020 it is 14% (the largest fraction across all debates), with the matched keywords “white supremacists”, “civil rights”, “black community”, “hispanic”, and “racist”.

BERT Text Classification of Ideology

BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained machine learning technique for natural language processing. BERT pre-trained models can be combined with a classification layer for text classification. The key innovation of BERT models is their ability to recognize context with bidirectional encoders, as opposed to only left-to-right or right-to-left sequences of tokens. This allows the model to learn the context of words from their surroundings rather than only from the order in which they are strung together.

BERT is a powerful tool for text classification of short passages of a few hundred tokens. The longer the token sequence, the more resources training the model will take. I parsed the presidential candidate responses into 192-token segments (a segmentation sketch follows the model snippet below). The pre-trained BERT model is listed below and is available for download on TensorFlow Hub:

import tensorflow_hub as hub
from official.nlp.bert import tokenization  # from the tensorflow/models repo (tf-models-official)

# More details here: https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/2
bert_layer = hub.KerasLayer(
    'https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/2',
    trainable=True)
vocab_file = bert_layer.resolved_object.vocab_file.asset_path.numpy()
do_lower_case = bert_layer.resolved_object.do_lower_case.numpy()  # is the BERT layer case sensitive?
tokenizer = tokenization.FullTokenizer(vocab_file, do_lower_case)
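The 192-token segmentation can then be done with this tokenizer; a minimal sketch, where segment_response is a hypothetical helper name:

def segment_response(text, max_tokens=192):
    # Split one candidate response into <=192 WordPiece-token segments.
    tokens = tokenizer.tokenize(text)
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]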

A TensorFlow input pipeline transforms the input text into batches of model inputs for multi-threaded training and testing. A minimal sketch of such a pipeline is below; it assumes the token IDs, masks, and segment IDs have already been produced with the tokenizer above.
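import tensorflow as tf

def make_dataset(word_ids, masks, type_ids, labels, batch_size=32):
    # Package the three BERT inputs with the labels, then shuffle,
    # batch, and prefetch so the GPU is not starved during training.
    ds = tf.data.Dataset.from_tensor_slices((
        {"input_word_ids": word_ids,
         "input_mask": masks,
         "input_type_ids": type_ids},
        labels))
    return ds.shuffle(1000).batch(batch_size).prefetch(
        tf.data.experimental.AUTOTUNE)

The last stage is to add a classifier model after the BERT layer: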

def create_model():
    input_word_ids = tf.keras.layers.Input(
        shape=(max_tok_sequence,), dtype=tf.int32, name="input_word_ids")
    input_mask = tf.keras.layers.Input(
        shape=(max_tok_sequence,), dtype=tf.int32, name="input_mask")
    input_type_ids = tf.keras.layers.Input(
        shape=(max_tok_sequence,), dtype=tf.int32, name="input_type_ids")
    pooled_output, sequence_output = bert_layer(
        [input_word_ids, input_mask, input_type_ids])
    # Hard-code initialization seeds for reproducibility; the dropout
    # rate is the tuned regularization hyperparameter.
    drop = tf.keras.layers.Dropout(0.2, seed=9)(pooled_output)
    # Single sigmoid unit: classifier scores between 0 and 1.
    output = tf.keras.layers.Dense(
        1, activation="sigmoid", name="output",
        kernel_initializer=tf.keras.initializers.glorot_uniform(seed=9))(drop)
    model = tf.keras.Model(
        inputs={"input_word_ids": input_word_ids,
                "input_mask": input_mask,
                "input_type_ids": input_type_ids},
        outputs=output)
    return model

model = create_model()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.BinaryAccuracy()])
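Training and scoring might then look like the following, where train_ds, val_ds, and unmatched_ds are hypothetical datasets built with make_dataset above and the epoch count is an assumption:

model.fit(train_ds, validation_data=val_ds, epochs=3)
scores = model.predict(unmatched_ds)  # one score in [0, 1] per 192-token segment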

Given the 19 candidates, I labeled them as Republican (1) and Democratic (0). The model is trained to give each debate response a classifier score. Only the data matched to keywords is used for training/validation. I look at the responses that are unmatched to keywords and score their ideology:

Classifier scores ranging from 0–1 that classify a text as Democratic or Republican ideology

Each candidate will have a classifier score for each of their debate responses (parsed into 192 token segments). The median score gives an indication of how much they align with their party.
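A hedged sketch of that ranking step, assuming the scores land in a pandas dataframe with one row per scored segment:

import pandas as pd

scores_df = pd.DataFrame({  # placeholder rows; real scores come from model.predict
    "candidate": ["Obama", "Obama", "McCain"],
    "score": [0.05, 0.10, 0.97],
})
median_alignment = (scores_df.groupby("candidate")["score"]
                    .median()
                    .sort_values())  # ~0 Democratic-aligned, ~1 Republican-aligned
print(median_alignment)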

The median score ranks how closely each candidate aligns with their party (the color scale) based on their debate responses

The scores show that Michael Dukakis, Jimmy Carter, Bill Clinton and John F. Kennedy rank as the most Democratic aligned with median scores close to zero. Nixon, Reagan and McCain all have median scores of about 0.95 and above and rank as being closest aligned to the Republican party. Gerald Ford is slightly anomalous in that he ranks closer to the Democratic platform than the Republican based on his debate responses, so I look at the responses that are classified as most Democratic party aligned. These responses for Gerald Ford include amnesty for draft evaders and military deserters, solving world hunger, and several mentions of “moral principles” and “moral standards” applied to peace efforts in the Middle East.

The winner for the most Democratic-aligned statement is Barack Obama, for his response on public education and college accessibility in his third debate with John McCain. The highest score for a Republican-aligned statement is from Donald Trump in his third debate with Hillary Clinton. He talks about how he plans to renegotiate trade deals, how NATO has to “pay up”, and how he will terminate NAFTA. In general, the most party-aligned statements tend to be found in long responses, which have a high probability of containing a rich set of party key phrases. Barack Obama’s response included “local school districts”, “public schools”, and “college accessibility and affordability”. This is in line with the features seen in the Scattertext plot earlier, where “education” is a word scored highly for the Democratic party. In Donald Trump’s statement, several of the highly scored Republican party words appear: “NAFTA”, “NATO”, “national debt”, “trade deals”, and cutting taxes.

Of Course… the Ironies and Foreshadowing

You can think of this section as just a bloopers reel of the presidential debates, but on second thought, it is not all lighthearted. I found this question addressed to Gerald Ford very poignant:

MR. MAYNARD: “Mr. President, twice you have been the intended victim of would-be assassins using handguns. Yet, you remain a steadfast opponent of substantive handgun control. There are now some forty million handguns in this country, going up at the rate of two point five million a year. And tragically, those handguns are frequently purchased for self-protection and wind up being used against a relative or a friend. In light of that, why do you remain so adamant in your opposition to substantive gun control in this country?”

A very provocative comment in the most recent debate between Joe Biden and Donald Trump left many in a state of shock. Will Trump accept the results of the election (presumably if he loses) and encourage a peaceful transition? In the first presidential debate with Hillary Clinton, his answer was plain:

“The answer is, if she wins, I will absolutely support her.” -Donald Trump

By the third presidential debate with Hillary Clinton, Chris Wallace posed the question again, noting that Trump’s running mate and his daughter had both pledged to accept the result:

“You have been warning at rallies recently that this election is rigged and that Hillary Clinton is in the process of trying to steal it from you. Your running mate, Governor Pence, pledged on Sunday that he and you — his words — ‘will absolutely accept the result of this election.’ Today your daughter, Ivanka, said the same thing. I want to ask you here on the stage tonight: Do you make the same commitment that you will absolutely — sir, that you will absolutely accept the result of this election?” -Moderator Chris Wallace

In an almost eerie foreshadowing of 2020, Trump replied:

“What I’m saying is that I will tell you at the time. I’ll keep you in suspense. OK?” -Donald Trump

This seems more like a “Happy Halloween” ending than a “Happy Election Day” ending, so I will conclude with some future directions for this code:

  • Binary text classification could be extended to multi-classification to map a debate response onto a set of topics.
  • Could a similar methodology be applied to Supreme Court confirmation hearings?
  • If they talk the talk, do they walk the walk? Compare the topics identified in debates and speeches to voting and policy records after election.

Code

Github link to Webscrape and analysis code

Link to interactive Scattertext plot

BERT Text Classification model

Debate Transcripts from The Commission on Presidential Debates

Link to charts

References

Jason S. Kessler “Scattertext: a Browser-Based Tool for Visualizing how Corpora Differ” arXiv:1703.00565

Jason S. Kessler “Visualizing thousands of phrases with Scattertext, PyTextRank and Phrasemachine” Medium article

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” arXiv:1810.04805
