The world’s leading publication for data science, AI, and ML professionals.

What House of Cards Got Right (and Wrong) About Data Science

Given the recent release of two House of Cards season 5 trailers and the impending release of season 5 of the binge-worthy Netflix series, this is a perfect opportunity to do two things:

  • Watch (or re-watch!) the series immediately.
  • Look back at how season 4 of House of Cards used a data scientist and how well his work reflects real-world Data Science.

Spoiler Alert: This post includes spoilers up through Season 4 of House of Cards. If you haven’t watched through season 4, I recommend stopping here.

A quick recap:

In seasons 3 and 4, Francis Underwood (Kevin Spacey) is the President of the United States. He and his wife, Claire Underwood (Robin Wright), are campaigning to win the 2016 Presidential election as President and Vice President, respectively. They enlist Leann Harvey (Neve Campbell) as campaign manager to run the Presidential campaign. She hires Aidan Macallan (Damian Young) as the campaign's data scientist, whose job is to gather data on voters in order to make better campaign decisions.

You can find a more detailed summary of seasons 1–4 here.

Focus Groups

Context: Doug Stamper, the President’s Chief of Staff, suggests that President Underwood selecting his wife as his running mate would be a disaster for the campaign. Leann Harvey, the soon-to-be campaign manager, objects to his use of focus groups as evidence:

Leann: You ran a few polls, so what?

Doug: Focus groups, too.

Leann: Meaningless.

What House of Cards got right: A focus group usually consists of up to 30 people in a room who are asked how they feel about an individual, an idea, an advertisement they just watched, and so on. Leann is correct that focus groups aren’t good indicators of how people will vote: if we were to take a poll of 30 random voters, the margin of error for that poll would be as high as 18%. (By contrast, most Presidential polls in America have a margin of error of around 4%.) A margin of error of 18% makes election predictions incredibly imprecise. For example, a poll showing 60% of people would vote for Clinton with an 18% margin of error would mean that the "true proportion" of Americans who would vote for Clinton is estimated to be anywhere between 42% and 78%! Using a focus group as evidence for why Claire would make a good or bad vice presidential candidate is, well, a little silly.
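Where does 18% come from? It falls out of the standard formula for the margin of error of a sample proportion, z·√(p(1−p)/n), evaluated at the worst case p = 0.5 with a 95% confidence level (z ≈ 1.96). A quick sketch:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a simple random sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# A 30-person "focus group" vs. a typical ~600-person Presidential poll.
focus_group = margin_of_error(30)    # ~0.179, i.e. roughly 18%
typical_poll = margin_of_error(600)  # ~0.040, i.e. roughly 4%
print(f"n=30:  ±{focus_group:.1%}")
print(f"n=600: ±{typical_poll:.1%}")
```

Note how the margin shrinks with √n: to cut the error in half, you need four times as many respondents, which is why a 30-person room can never substitute for a real poll.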

In addition, focus groups involve discussion among participants. During these discussions, people often change their minds or voice an opinion that seems popular in the group even if they don’t actually hold it. (The psychologist Solomon Asch ran a famous series of conformity experiments on this very idea.) If you’ve ever voted in the United States, you know that you enter the voting booth alone and make your decision away from the pressure of other people. This suggests that focus groups don’t provide accurate estimates of how an election will turn out – because they don’t reflect how we actually vote!

What House of Cards got wrong: While a focus group provides neither an accurate nor a precise estimate of election results, that doesn’t mean they’re entirely meaningless, as Leann suggested. Focus groups are often used for qualitative research: for example, how people perceive messaging or how people react to the opinions of others. Listening to how the members of a focus group react to a commercial they just watched provides insight into how, say, a family or group of friends might react while watching that commercial at home. Understanding participants’ words and reactions can be invaluable to marketers and politicians.

Ken Bone, American hero and global meme.

Supervised Learning

Context: Data scientist Aidan Macallan and campaign manager Leann Harvey meet in a jazz club with performers playing in the background. Leann wants to know what the Underwoods need to do in order to earn more votes, but Aidan hasn’t found anything.

Leann: "We gave you access – I need more than we are getting. You’re supposed to tell me what people want."

Aidan: "People didn’t know they wanted jazz until they heard it for the first time. I can get them to like the music – I can’t compose it. Give me something I can work with."

Claire Underwood discovering "beyond" in South Dakota.

As Aidan and Leann discuss, Aidan doesn’t have the power to create something for the Underwoods to say on the campaign trail… but once Frank and Claire said something that resonated with voters, Aidan could pick up on the voters’ sentiment and encourage the Underwoods to say it over and over again. Aidan discovered that Claire’s use of the word "beyond" in a speech in South Dakota started to swing voters in the Underwoods’ favor, so the campaign then used the word "beyond" as frequently as it could.

What House of Cards got right: Data science isn’t magic. In the real world, if we want to test whether a political ad will swing voters in one direction, someone must first create the ad; only then can we run an experiment to test it. This isn’t limited to politics. Experiments (often called A/B tests in non-academic settings) are common in many fields, including clinical trials, website layout, and email marketing campaigns.

Without observing any situation in which voters would be likelier to vote for you, it’s understandably difficult to predict situations that would cause voters to vote for you. Aidan’s point was that he couldn’t predict a situation in which voters would start changing their minds, but once he observed voters changing their minds, he could identify why it happened. (In data science, we use the term supervised learning to refer to situations like this, where we’ve observed some output like increased voter approval.)
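To make the idea concrete, here is a minimal sketch of "learning from an observed output." The numbers are entirely invented for illustration: hypothetical daily counts of how often "beyond" appeared in speeches, paired with observed next-day changes in approval. A simple least-squares fit estimates how strongly the two move together:

```python
# Hypothetical data, invented for illustration: daily counts of the
# word "beyond" in speeches vs. next-day change in approval
# (percentage points).
mentions = [0, 1, 2, 4, 5, 8]
approval_change = [-0.2, 0.1, 0.4, 0.9, 1.1, 1.8]

# Ordinary least-squares fit: how much approval moves per mention.
n = len(mentions)
mean_x = sum(mentions) / n
mean_y = sum(approval_change) / n
slope = (sum((x - mean_x) * (y - mean_y)
             for x, y in zip(mentions, approval_change))
         / sum((x - mean_x) ** 2 for x in mentions))
intercept = mean_y - slope * mean_x
print(f"estimated effect: {slope:.2f} points per mention")
```

This is supervised learning at its simplest: the "supervision" is the observed approval change, and the fitted slope is what Aidan could only estimate after voters had already reacted.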

While the word "beyond" is perhaps an odd choice for what voters "want to hear," there are many words from the 2016 U.S. Presidential election that likely helped to move voters. Consider the nicknames Donald Trump gave those who opposed him:

  • Lyin’ Ted (Cruz)
  • Little Marco (Rubio)
  • Goofy Elizabeth Warren
  • Crooked Hillary (Clinton)
  • Low Energy Jeb (Bush)

Knowing what words have an impact on voters can be invaluable to someone running for office.

What House of Cards got wrong: Data scientists could make headway on this problem even without a particular ad or phrase that plays well. Think about common elements in Super Bowl commercials – dogs, emotional appeals, and political messages recur in the most-talked-about ads. Data scientists could use machine learning techniques like regression to see what generates the most traffic, even if they can’t yet detect what will cause voters to vote a particular way. Similarly, data scientists could look at existing political ads to try to isolate what makes a "good" ad versus a "bad" one.

Clustering

Context: Data scientist Aidan Macallan is pitching his analytics firm to win an NSA contract that surveils all American phones in order to identify and track potential terrorists. (The Underwoods, however, want to use this data to ensure they win the election.)

What data scientists do when their code is running.

Aidan: Take firearms, for instance. If we start with everyone who legally owns a gun, track geolocation patterns through their phones, we start to put together a portrait of where gun owners live, eat, shop, everything. From this, we predict everyone who might want a firearm but who isn’t registered. They’re likely to exhibit the same behavior as people who are. You can use that for people who are interested in Arabic, who want to travel to the Middle East, who are disillusioned with the U.S. government.

What House of Cards got right: Data scientists can indeed use techniques to find individuals who fit into a certain category. In the absence of labeled data for both categories – people who own guns and people who want one but aren’t registered – it’s likely that a data scientist would use cluster analysis: identify different groups of voters, see where the gun owners tend to live, eat, and shop, then look at those who live, eat, and shop in the same places but don’t own a gun. Cluster analysis identifies groups by finding observations or individuals that appear "close to" one another. One commonly cited real-world example is when statisticians at Target knew a teenage girl was pregnant before her father did, because her shopping habits matched those of (read: were "close to") other pregnant women.
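Here is a minimal sketch of the idea behind cluster analysis, using k-means on toy 2-D "location" coordinates (the data points are invented purely for illustration, and the naive initialization is a simplification of real k-means):

```python
# Toy 2-D "location" coordinates forming two loose groups.
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
          (5.0, 5.1), (5.2, 4.9), (4.8, 5.3)]

def kmeans(points, k, iters=10):
    # Naive init: first k points. (Real k-means uses random restarts.)
    centroids = list(points[:k])
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2
                                      + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to its cluster's mean.
        for i, c in enumerate(clusters):
            if c:
                centroids[i] = (sum(p[0] for p in c) / len(c),
                                sum(p[1] for p in c) / len(c))
    return centroids, clusters

centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))  # two centers, one per group
```

The algorithm alternates between assigning points to the nearest centroid and moving each centroid to the mean of its cluster. Real voter clustering would use many more features (purchase history, demographics, geolocation traces), but the mechanics are the same: points that are "close to" one another end up in the same group.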

What House of Cards got wrong: Where someone takes their phone – i.e., the geolocation patterns Aidan mentioned – isn’t the only method by which data scientists can make predictions. In fact, the Target pregnancy prediction model was based on shopping habits. Data scientists will frequently use variables like purchase history and demographic information to organize consumers, voters, and even fonts based on how "similar" they are. One cool example is this font map designed by Ideo, which relies on artificial intelligence to visually lay out how similar different fonts are.

Credit to Joseph Nelson for putting this GIF (pronounced "jif") together.

Conclusion

All in all, House of Cards did a good job of representing data science best practices. The show’s writers didn’t treat data science as a deus ex machina to resolve the story. Honestly, the "what House of Cards got wrong" bits are mostly about providing a more complete picture of how data science operates, rather than correcting misstatements by the showrunners. This isn’t surprising for a company that not only employs data scientists but once hosted a challenge to use data science to improve its recommendation algorithm.

I mean, even Claire Underwood knows what’s up with p-values.

Ba-dum-tiss.

Thanks to Joseph Nelson, Laura Leebove, and Rebecca Louie for their edits, resulting in a much improved article!
