An Introduction to Estimating the Causal Effects of Feasible Interventions

A flexible approach to causal inference

Nick Williams
Towards Data Science
6 min read · Jun 18, 2020


Photo by Drew Beamer on Unsplash

Imagine you’re applying to law school. You’ve taken the LSAT, but didn’t score as high as you wanted. You start to wonder: what would be the effect of raising your LSAT score on your admissions outcome (defined as the proportion of schools you applied to that accepted you)?

At first glance, this seems easy. You find some data from previous years of applicants and regress the admissions outcomes on LSAT scores. Great, you have a measure of association between LSAT scores and the proportion of schools one can expect to be accepted to.

But… did that really answer your question? Is your estimate unbiased? You think back to the assumptions of linear regression and realize that the proportion of schools you can be accepted to is bounded between 0 and 1, so its relationship with LSAT score can’t be linear everywhere. You also remember the canned interpretation of a regression coefficient, “For every unit increase in X, Y increases or decreases by β”; how can this interpretation hold for someone who has already maxed out their LSAT score?
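To make both problems concrete, here is a minimal Python sketch on simulated data (not real admissions data). The logistic data-generating curve, the 10-school application count, the 5-point increase, and the helper name `feasible_shift` are all assumptions made for illustration; only the 120 to 180 LSAT score range is a fact about the test.

```python
import numpy as np

LSAT_MIN, LSAT_MAX = 120, 180  # LSAT scores range from 120 to 180


def feasible_shift(scores, delta=5):
    """Hypothetical helper: raise each score by `delta`, capped at the LSAT maximum."""
    return np.minimum(np.asarray(scores) + delta, LSAT_MAX)


# Simulated applicants (assumption: acceptance follows a logistic curve in LSAT score).
rng = np.random.default_rng(0)
scores = rng.integers(140, 181, size=500)
true_prop = 1 / (1 + np.exp(-(scores - 160) / 5))
accepted = rng.binomial(10, true_prop) / 10  # proportion of 10 schools that said yes

# Naive approach: a straight line through a bounded outcome.
slope, intercept = np.polyfit(scores, accepted, deg=1)

# "One more unit of X" applied blindly to a near-perfect scorer gives an
# impossible score, and the line happily predicts an impossible proportion.
impossible_score = 178 + 5
naive_prediction = intercept + slope * impossible_score

# A feasible intervention only moves people to scores they could actually get.
shifted = feasible_shift([150, 178])

print("fitted slope per LSAT point:", slope)
print("linear prediction at an LSAT of", impossible_score, ":", naive_prediction)
print("feasible shift of [150, 178]:", shifted.tolist())
```

In this simulation the fitted line keeps climbing past the score ceiling, while the capped shift leaves a 178 scorer at 180 rather than asking what would happen at a score nobody can earn.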

With this in mind, you start to narrow in on your research question: what would be the effect of raising LSAT scores on law school admissions outcomes if LSAT scores were raised by a realistic amount? We can…


Research Biostatistician at Weill Cornell Medicine. Interested in non-parametric causal inference and open source software design.