
Bias Correction For Paid Search In Media Mix Modeling: Linked Paper
Media Mix Modeling attempts to estimate the causal effect of media spend on sales based solely on observational data. And, as we all know, estimating causal effects from observational data is fraught with challenges.
Over time, two leading, and complementary, frameworks have emerged for dealing with causal inference:
- Rubin’s Potential Outcome Framework.
- Pearl’s Graphical Framework.
This paper explores the use of Pearl’s graphical framework to control for selection bias in media mix modeling, specifically in paid search ads.
Problem Setup
Suppose we are aiming to measure the causal impact of search advertising (PPC) on sales. In a simple regression model, we can regress sales on spend and produce a causal estimate:
sales = average_sales + roas_estimate * search_spend + error
We can fit the model above with ordinary least squares (OLS), and in a simple world this would produce an accurate estimate of ROAS. Unfortunately, the world isn’t so simple, and we know that there are often confounding variables. For example, we know that organic search is also a driver of sales and that both organic search and paid search share an underlying cause: consumer demand. The graph below illustrates this.

Above, we can see how economic factors drive consumer demand, which in turn drives search queries, which in turn drive both paid search and organic search.
When we use our simple model, defined above, to describe the more complex world pictured above, we run into the problem of selection bias or, more broadly, endogeneity.
Endogeneity: a term from econometrics referring to an explanatory variable that is correlated with the error term.
Put simply, our model is capturing, or absorbing, the explanatory value of organic search in both the error term and the ROAS estimate, thus producing a biased ROAS estimate.
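This intuition has a precise form. For a single omitted regressor, the standard omitted variable bias result from econometrics, written in the notation of our model, says that in large samples:
biased_roas = true_roas + organic_effect * cov(search_spend, organic_search) / var(search_spend)
Because paid and organic search are positively correlated (both are driven by the same search queries) and organic search genuinely drives sales, the bias term is positive and the naive ROAS estimate is inflated.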
To control for this, the paper suggests using Pearl’s backdoor criterion, which is based on the idea of using graphical models to describe causal relationships. Graphical models are useful as they allow us to incorporate domain knowledge and ideas from graph theory.
One such idea is d-separation.
D-separation is short for directed separation, and it tells us whether two nodes in a graph are separated (i.e. conditionally independent of each other) given a third set of nodes.
For example, in the graph below, we can say z is d-separated, or conditionally independent, from y given x1 and x2.

Another important idea is the backdoor criterion.
Backdoor criterion: Given a causal diagram, a set of variables Z satisfies the back-door criterion relative to an ordered pair of variables (X, Y) in the diagram if: 1) no node in Z is a descendant of X; and 2) Z "blocks" every path between X and Y that contains an arrow into X.
Furthermore, if a set of nodes Z satisfies the backdoor criterion for the directed pair (X → Y), we can recover an unbiased estimate of the effect of X on Y, given a large enough data set. This property is known as identifiability.
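Concretely, identifiability means the interventional distribution can be computed from purely observational quantities. If Z satisfies the backdoor criterion for (X → Y), the backdoor adjustment formula gives:
P(y | do(x)) = Σ_z P(y | x, z) * P(z)
In other words, once we adjust for Z, observing X tells us the same thing as intervening on X would.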
To familiarize yourself with the backdoor criterion, I recommend playing with the following code snippet (try creating various graphs and determining which nodes satisfy the backdoor criterion) and exploring additional resources.
from causalgraphicalmodels import CausalGraphicalModel

simple = CausalGraphicalModel(
    nodes=["x1", "x2", "z", "y"],
    edges=[
        ("z", "x1"),
        ("z", "x2"),
        ("x1", "y"),
        ("x2", "y"),
    ]
)

simple.draw()
simple.is_valid_backdoor_adjustment_set("x1", "y", {"z"})
# output is True
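We can also verify the d-separation claim from the earlier example directly. The package exposes an is_d_separated method for this (an assumption about the package version; the call below reflects the API as I understand it):
# z should be conditionally independent of y given x1 and x2,
# since x1 and x2 together block every path from z to y
simple.is_d_separated("z", "y", {"x1", "x2"})
# output is True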
Application to Search
Now that we’ve explored some of the basic concepts related to causal graphical models, we can see how they are applied to recover an unbiased ROAS estimate for paid search. To begin, the paper illustrates the causal structure of the environment (seen below).

The above diagram suggests we are modeling the impact as:
sales = average_sales + roas_estimate * search_spend + error_0 + error_1
Where error_0 and error_1 absorb the impact of consumer demand and organic search, respectively.
Based on our knowledge of graphical models, we now know that, if we can recover a measure of search queries, we can satisfy the backdoor criterion for (paid_search → sales). To test this, we can use a handy package called causalgraphicalmodels.
from causalgraphicalmodels import CausalGraphicalModel

search = CausalGraphicalModel(
    nodes=["economic_factors", "consumer_demand", "search_q", "auction",
           "organic_search", "paid_search", "sales"],
    edges=[
        ("economic_factors", "consumer_demand"),
        ("consumer_demand", "sales"),
        ("consumer_demand", "search_q"),
        ("search_q", "auction"),
        ("auction", "paid_search"),
        ("search_q", "paid_search"),
        ("search_q", "organic_search"),
        ("organic_search", "sales"),
        ("paid_search", "sales"),
    ]
)
search.is_valid_backdoor_adjustment_set("paid_search", "sales", {"search_q"})
# output is True
In the code above, we define our causal graphical model (DAG) and test if our control variables satisfy the backdoor criterion for (paid_search → sales).
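Rather than checking candidate adjustment sets one at a time, the package can also enumerate them for us. A quick sketch, assuming the get_all_backdoor_adjustment_sets method behaves as I recall:
# list every set of observed nodes that satisfies the backdoor
# criterion for paid_search --> sales
search.get_all_backdoor_adjustment_sets("paid_search", "sales")
# frozenset({'search_q'}) should be among the results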
Next, we generate sample data and run an OLS regression to compare estimates when we satisfy the backdoor criterion and when we don’t.
from causalgraphicalmodels import StructuralCausalModel
import numpy as np
import statsmodels.api as sm

# create a structural causal model. This lets us generate data.
# the true ROAS (the paid_search coefficient on sales) is 0.3
search_data = StructuralCausalModel({
    "consumer_demand": lambda n_samples: np.random.normal(100, 5, size=n_samples),
    "search_q": lambda consumer_demand, n_samples: np.random.normal(consumer_demand * .3, 1, n_samples),
    "organic_search": lambda search_q, n_samples: np.random.normal(search_q * .6, 1),
    "paid_search": lambda search_q, n_samples: np.random.normal(search_q * .1, 1),
    "sales": lambda organic_search, paid_search, n_samples: np.random.normal(75 + organic_search * .2 + paid_search * .3, 1),
})

data = search_data.sample(156)
# run OLS without backdoor criterion satisfied for paid_search --> sales
X = data[['paid_search']].values
X = sm.add_constant(X)
results = sm.OLS(data.sales.values, X).fit()
print(results.summary())
# with backdoor criterion satisfied
X = data[['paid_search', 'search_q']].values
X = sm.add_constant(X)
results = sm.OLS(data.sales.values, X).fit()
print(results.summary())
Resulting in the following ROAS estimates:

As we can see, both estimates capture the true parameter, with the unbiased estimator (backdoor criterion satisfied) coming much closer to the true value.
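As a quick sanity check, we can also pull out just the fitted ROAS coefficients and compare them to the true value of 0.3 used in the simulation. A small sketch, reusing data and sm from above:
# refit both models and extract the paid_search coefficient
# (index 1, i.e. the first coefficient after the constant)
roas_biased = sm.OLS(data.sales.values, sm.add_constant(data[['paid_search']].values)).fit().params[1]
roas_adjusted = sm.OLS(data.sales.values, sm.add_constant(data[['paid_search', 'search_q']].values)).fit().params[1]
print("true ROAS: 0.3")
print("biased estimate:  ", roas_biased)
print("adjusted estimate:", roas_adjusted)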
Now, you may have noticed in the code sample that we only sampled 156 data points, which is equal to three years’ worth of weekly MMM data. This isn’t a lot of data, and it raises an important question: how do we know when our sample size is large enough?
The paper suggests that this concern may be alleviated "when sample size is sufficiently large to allow for non-parametric estimates"; however, large sample sizes are not common in MMM.
To explore this further, I’ve created the graphs below, which show how the ROAS estimates and confidence intervals change given increasingly larger sample sizes.


As we can see, the unbiased estimator converges to the true parameter, whereas the biased estimator is overly optimistic. Additionally, the graphs above highlight how small sample sizes produce very large confidence intervals, which is something to take note of if your sample size is small.
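For reference, here is a sketch of the kind of loop used to produce graphs like the ones above; the specific sample sizes are my own choices, and the plotting code is omitted:
import statsmodels.api as sm

# reuse the structural causal model (search_data) defined earlier
estimates = []
for n in [50, 100, 156, 500, 1000, 5000, 10000]:
    d = search_data.sample(n)
    # biased model: paid_search only
    biased = sm.OLS(d.sales.values, sm.add_constant(d[['paid_search']].values)).fit()
    # adjusted model: backdoor criterion satisfied via search_q
    adjusted = sm.OLS(d.sales.values, sm.add_constant(d[['paid_search', 'search_q']].values)).fit()
    estimates.append({
        "n": n,
        "biased_roas": biased.params[1],
        "biased_ci": biased.conf_int()[1],      # 95% CI for the paid_search coefficient
        "adjusted_roas": adjusted.params[1],
        "adjusted_ci": adjusted.conf_int()[1],
    })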
At this point, we’ve covered the meat and bones of the theoretical portion of the paper:
- Selection bias in paid search and MMM
- Causal graphical models / Pearl’s framework
- How to apply them to a simple paid search scenario
- How to simulate data and implement models
- A few "gotchas" to watch out for
The paper explores these subjects in more detail and goes on to cover:
- Complex scenarios
- Implementation
- Empirical Results
I highly recommend the interested reader check out the full paper for more details.