
A Stacked SCOTUS

How one simple engineered feature helps predict U.S. Supreme Court decisions with 97% accuracy

Photo by Sora Shimazaki from Pexels

When Justice Amy Coney Barrett was confirmed to the United States Supreme Court on October 31, 2020, the Republican party let out a collective sigh of relief. A conservative majority all but guaranteed a reversal of Roe v. Wade and other precedents that did not support the Republican agenda. Finally, the United States judicial branch could return to a sense of family values, limited government, a tax policy allowing for individual prosperity, and –

…Wait a second. Isn’t the U.S. Supreme Court supposed to be impartial?

Background

40 individual justices have served on the U.S. Supreme Court (affectionately known as SCOTUS) since 1941, ranging from Justice Hugo Black to the recently appointed Justice Amy Coney Barrett. Appointed by the most controversial president of the 21st century in the middle of a nationwide pandemic, Justice Barrett has had no shortage of scrutiny for her previous actions and voting record. Landmark cases like Roe v. Wade (1973) and Obergefell v. Hodges (2015) have a high chance of reversal, or at least legal analysts on CNN seem to think so.

Justice Barrett will soon vote on her first case surrounding a woman’s right to an abortion; although her voting record so far has proven to be less partisan than liberal critics had anticipated, all eyes are on the outcome of FDA v. American College of Obstetricians and Gynecologists. The death of longtime SCOTUS member and civil rights icon Justice Ruth Bader Ginsburg eliminated an outspoken voice on abortion and other critical human issues, and many fear that Justice Barrett will bring a very different vote in her absence. This case will reveal the new Court’s stance on this intersectional issue of public health and human rights.

Back in 2020 when faced with my Classification project at Metis, I was hearing a lot about how the political persuasion of SCOTUS justices strongly influenced the decisions of the court. A judge seeks to make a balanced, impartial decision – wouldn’t one of the most esteemed judicial positions in the United States be occupied by a balanced, impartial judge? It seemed that this idealistic thought did not correlate to the world around me, but I didn’t yet have the data to back that up. Developing a classification model to predict how a Supreme Court might vote on a particular issue seemed like the perfect solution to enhance my understanding and flex some recently acquired skills in model development. I had no idea just how predictable these decisions were.

(PS – if you need a crash course in how SCOTUS works (I certainly did!), check out this video from the appropriately named YouTube channel CrashCourse, which succinctly breaks down how SCOTUS comes to a decision for those of us who haven’t thought about the judicial branch since our 10th grade Government class.)

The Data

LGBTQ+ activists raise concern that Justice Barrett’s votes may lead to reversal of civil rights cases like Obergefell v. Hodges | Photo by Matt Popovich, Unsplash

Thankfully, there is no shortage of data available about SCOTUS, nor of similar projects predicting SCOTUS decisions. The Supreme Court Database (SCDB), maintained by Washington University Law, keeps fastidious records on every vote cast in the history of the Supreme Court – over 120,000 votes since 1941, to be specific. Each observation in the dataset comes with 61 categorical features (independent variables) describing the decision, ranging from the "type" of defendant to the larger social or economic issue that the case addresses.

Case outcome was represented in a number of variables, including Declaration of Unconstitutionality, Case Disposition, and Decision Direction. Because it is the duty of SCOTUS to uphold the Constitution, every decision indicates whether the matter at hand was found constitutional or unconstitutional. Case Disposition refers to the decision relative to the previous court’s ruling; a reversal indicates the opposite ruling, while staying or affirming indicates that the lower court’s decision stands.

Decision Direction was the most interesting variable for this study, with only three possible outcomes – liberal, conservative, or unspecifiable. For example, an outcome that is pro-affirmative action would be considered liberal per the database’s standard, while a decision outcome that is anti-affirmative action would be considered conservative. SCDB lists all the various issues and how their outcome is classified if you’re interested in unpacking exactly what these labels mean.
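If you’d like to poke at the data yourself, here’s a minimal sketch of loading a local copy of the SCDB case-centered file with pandas and mapping the Decision Direction codes to readable labels. The file path is a placeholder, and the numeric codes reflect my reading of the SCDB codebook, so verify them against the documentation:

```python
import pandas as pd

# Placeholder path to a local download of the SCDB case-centered dataset.
scdb = pd.read_csv("SCDB_case_centered.csv", encoding="latin-1")

# decisionDirection is stored as a numeric code; map it to readable labels.
# (Assumed codes per the SCDB codebook: 1 = conservative, 2 = liberal, 3 = unspecifiable.)
direction_labels = {1: "conservative", 2: "liberal", 3: "unspecifiable"}
scdb["direction"] = scdb["decisionDirection"].map(direction_labels)

print(scdb["direction"].value_counts())
```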

Overall, SCOTUS appears fairly bipartisan, with an almost 50/50 split of decision directions since 1941 | Image by author

With roughly 40,000 conservative decisions and 38,500 liberal decisions, the court seems, on the surface, to be fairly bipartisan, but there was more to explore here. The SCDB includes helpful issue codes for the various cases heard by SCOTUS, ranging from criminal procedure to federal taxation. When the vote ratios are broken down by issue, they tell quite a different story.

Image by author

While there is an almost 50/50 split of votes on the surface of SCOTUS decisions, the court has historically shown strong bias on a number of issues. In cases regarding Federal Taxation, 74% of rulings swayed liberal, while 67% of rulings on Privacy swayed conservative. Every single issue shows a certain bias from the court, regardless of the volume of cases heard over time within that particular issue. The variation in Decision Direction, specifically in relation to the issue of the case itself, made it my perfect target (dependent variable).
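Seeing that issue-level skew is just a groupby away. This sketch assumes the scdb DataFrame and direction column from the loading snippet above, plus SCDB’s issueArea code:

```python
# Proportion of liberal vs. conservative outcomes within each issue area.
directional = scdb[scdb["direction"].isin(["liberal", "conservative"])]
by_issue = (
    directional.groupby("issueArea")["direction"]
    .value_counts(normalize=True)   # share of each direction within the issue
    .unstack(fill_value=0)
    .sort_values("liberal", ascending=False)
)
print(by_issue)
```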

I had found my target variable in Decision Direction, the indicator of the overall qualitative political standpoint of a court’s decision. Now how to describe the justices themselves?

Martin-Quinn Score

Legal academics and citizens alike have long been interested in the complex nature of judicial bias, and the landscape changed with the introduction of a measure of this bias in 1999. Andrew D. Martin and Kevin Quinn created the Martin-Quinn score as a scaled indicator of the political leaning of a justice, based on previous votes during their tenure on the court. The scale has its own set of calculations, but the most important thing to know is that 6 indicates an extremely conservative justice and -6 indicates an extremely liberal justice, with most justices falling between -4 and 4.

Martin-Quinn scores are recalculated every year and change over time based on the justice’s votes from the previous year. While Justice Ginsburg (represented by the green line below) started with a moderate Martin-Quinn score of -0.21 in 1993, her decisions became more liberal over time, resulting in a score of -2.82 during her last term in 2020. These scores have been carefully calculated and maintained since 1999, and are a widely used metric of individual judicial bias.

Visualization of bias over time from Oct 2020 SCOTUS | Image by Author

Martin-Quinn scores do a great job at indicating the political leaning of an individual justice, but how do these scores play together to indicate the ideological leaning of the entire court? What sort of complex process could I undergo in order to represent the nuance of a court’s overall political disposition?

Well… I’ve always been fond of addition. When the individual justice scores for each natural court in a given year are added together, a new value, called the Court MQ, is created. If you have seven justices on a court with respective Martin-Quinn scores of 4, 2, 1, -3, -4, 2, and 2, your Court MQ would be 4, indicating a very (but not extraordinarily) conservative court. This widens the metric, with most courts falling between -10 and 10 as opposed to the individual-justice range of -6 to 6, but it represents the overall bias while weighting strong individual leanings appropriately.
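In code, this feature engineering really is just a sum. The file and column names below (a justice-level Martin-Quinn export with term and post_mn columns) are placeholders for the published MQ data, not the project’s exact inputs:

```python
import pandas as pd

# Justice-level Martin-Quinn scores: one row per justice per term.
mq = pd.read_csv("justices.csv")  # placeholder for the published MQ score file

# Court MQ: sum every sitting justice's score within a term.
court_mq = mq.groupby("term")["post_mn"].sum().rename("court_mq")

# Sanity check with the seven-justice example from the text.
example_scores = [4, 2, 1, -3, -4, 2, 2]
print(sum(example_scores))  # 4 -> a conservative-leaning court
```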

When we plot Court MQ over time, the implications are staggering:

This visualization shows the dramatic change in SCOTUS bias since 1947 | Image by author

As the Court MQ score rises, a more conservative court is indicated; as it falls, a more liberal court is indicated. The court under Chief Justice Earl Warren was the most liberal court in recent history, with appointments from JFK and Lyndon B. Johnson in the 1960s creating an even stronger bias. The court neutralized under Chief Justice Burger, and has leaned conservative ever since. Once Martin-Quinn data is available for Justice Amy Coney Barrett, a move back into the most conservative court since the early 1990s is virtually guaranteed.

Court MQ is a striking and powerful feature for displaying court bias over time, and all it took to create it was addition. In the midst of this concerning data, a core truth of data science emerges: a simple feature can often be the most descriptive one in a model. The success of this model hinged on Court MQ, and the new metric certainly did not disappoint.

The Model

As mentioned above, the data contained over 60 categorical variables for each vote, but many of these were useful only for reference purposes, like case ID and docket ID. There were some substantial variables, however, and I found the most correlated variables to be issue, the newly engineered Court MQ, and petitioner/respondent type. This final variable indicates what sort of party the petitioner (plaintiff) and respondent (defendant) were – a citizen, a government body, or a larger federal entity. This felt like useful information for the model, and is described in more detail on the SCDB website.

(if you’re not a big data nerd like I am, feel free to skip this next section, where we get into the nitty gritty)
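The preprocessing itself is nothing fancy: one-hot encode the categorical fields and pass the continuous Court MQ through untouched. The column names here are stand-ins for the actual SCDB fields, and court_mq is assumed to have already been merged onto each case by term:

```python
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

categorical = ["issueArea", "petitioner", "respondent"]  # assumed SCDB field names
numeric = ["court_mq"]                                   # engineered feature, merged by term

X = scdb[categorical + numeric]
y = scdb["direction"]

preprocess = ColumnTransformer(
    [("cats", OneHotEncoder(handle_unknown="ignore"), categorical)],
    remainder="passthrough",  # leave court_mq as a raw numeric column
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```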

ROC Curves for multiple tested models | Image by author

Multiple models were tested, ranging from Naive Bayes (whose core assumption of conditionally independent features did not line up with this study) to the ironclad and reliable Random Forest classifier. Decision Tree/Random Forest, SVC, and KNN classification models were all excellent classifiers, but speed of prediction and slightly higher recall and precision scores made Random Forest the best choice for a Supreme Court decision predictor. The algorithm is particularly well-suited to the categorical variables of issue and petitioner/respondent type, while giving the same detailed attention to the continuous Court MQ. All in all, the Random Forest model draws on these fairly simple features to predict SCOTUS decisions with 97% accuracy (and identical scores on precision and recall).
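For a rough sense of the final setup, the Random Forest drops into a pipeline alongside the preprocessing above. The hyperparameters here are generic defaults, not the tuned values behind the 97% figure:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.pipeline import Pipeline

# Chain the one-hot preprocessing with the classifier so raw rows go straight in.
model = Pipeline([
    ("prep", preprocess),
    ("forest", RandomForestClassifier(n_estimators=300, n_jobs=-1, random_state=42)),
])

model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))  # accuracy, precision, recall
```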

That model was integrated into a Streamlit app (thanks to my Metis colleague Drew Hibbard and his fantastic article about deploying web apps!) and is now available as a predictor for landmark cases over time, or on an issue of the user’s choice. The app allows the user to build "A Stacked SCOTUS" and see how the court’s individual MQ scores can lead to some shocking overturned decisions. I won’t spoil it for you – try the Stacked SCOTUS web app yourself!

(and keep your eyes peeled for Amy Coney Barrett to be added as a justice! I can’t wait to see how these potential court decisions stack up…)

(another note: this web app should not be confused with the great work being done at https://fantasyscotus.net/, a competition league for legal enthusiasts to also [you guessed it] predict SCOTUS decisions.)
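For a sense of how lightweight the deployment can be, here’s a minimal Streamlit sketch of the app’s core interaction: slide each justice’s Martin-Quinn score, sum them into a Court MQ, and hand the result to the trained model. The widget labels and the predict_direction helper are hypothetical, not the deployed app’s actual code:

```python
import streamlit as st

st.title("A Stacked SCOTUS")

# Build a hypothetical court: one Martin-Quinn slider per seat.
scores = [st.slider(f"Justice {i + 1} Martin-Quinn score", -6.0, 6.0, 0.0) for i in range(9)]
court_mq = sum(scores)
st.write(f"Court MQ: {court_mq:.2f}")

issue = st.selectbox("Issue area", ["Criminal Procedure", "Civil Rights", "Privacy", "Federal Taxation"])

# predict_direction would wrap the trained pipeline from the modeling step.
# st.write(predict_direction(issue, court_mq))
```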

The Stacked SCOTUS web app, utilizing the model described above to predict SCOTUS decisions | Image by author

Final Verdict

While it is gratifying to build an accurate model that utilizes this new Court MQ metric, the implications of this data are enough to give pause to any concerned citizen. Those legal analysts on CNN aren’t lying; the ideological sway and voting record of a U.S. Supreme Court justice carry tremendous weight in the decisions of the court as a whole. With such strong indication of bias on certain issues, and of how much that bias has changed over time, it is the duty of the executive branch to consider the entire makeup of a court when appointing judges.

The justices appointed to the Court deserve tremendous respect for their storied careers and accomplishments that led them to the bench. However, even the highest court of the land is full of humans, and each of us carries our own biases. It is the duty of humanity in general, and especially powerful SCOTUS justices, to reckon with these biases and how they affect our civic responsibilities. When civil rights issues like same-sex marriage and abortion are heard by the court, this bias will play a critical role in determining their votes, and our future. Acknowledging and studying that bias through data can help the judicial branch understand how to move forward as a humanly partisan, but fair, arbiter of justice in the United States.

(If you enjoyed the nuts-and-bolts talk, check out my GitHub repo for this project, specifically the code for the web app!)

