Getting Started

The “Bias-Variance Trade-Off” Explained Practically (In Python)

If you have ever felt confused about the bias-variance trade-off, it’s probably because you have only read theoretical explanations. It’s simpler than it seems, with the aid of a few lines of Python.

Samuele Mazzanti
Towards Data Science
7 min read · Jul 31, 2021

[Figure by Author]

The “bias-variance trade-off” is one of the most frequent topics in data science interviews. Still, many candidates struggle to understand the concept in depth. I guess it happens because this topic is always explained from a fully theoretical point of view.

However, I believe that the best way to understand something is to do it yourself — or, better, to code it yourself.

You don’t really understand it until you can code it!

In this article, with the aid of some data, we will see what the bias-variance trade-off means in practice, and how to compute it in Python (both from scratch and using an out-of-the-box implementation).

Bias-variance formula: from theory to practice

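The usual definition of the bias-variance decomposition is:

$$\mathrm{MSE}(\hat{\theta}) = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big] = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta}, \theta)^2$$

where MSE stands for Mean Squared Error, θ represents the true parameters of the model (for instance, in linear regression, θ would be the vector containing all the regression coefficients) and θ̂ is their estimate.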

But there is a problem: we can never observe the true value of θ. Moreover, in some models, it’s not possible to find explicitly what θ is. So this definition is quite useless from a practical point of view.
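To see what the missing θ costs us, here is a minimal sketch in which we cheat: we simulate the data ourselves, so the true θ is known. It checks numerically that the mean squared error of an estimator equals its variance plus its squared bias. The estimator (a shrunken sample mean), the shrinkage factor and all variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 2.0        # true parameter, known only because we simulate the data
n_datasets = 5000  # number of simulated training sets
n_samples = 20     # observations per training set
shrink = 0.8       # hypothetical shrinkage factor that makes the estimator biased

# One estimate of theta per simulated dataset: a shrunken sample mean.
samples = rng.normal(loc=theta, scale=1.0, size=(n_datasets, n_samples))
theta_hat = shrink * samples.mean(axis=1)

mse = np.mean((theta_hat - theta) ** 2)  # average squared distance from the truth
variance = np.var(theta_hat)             # spread of the estimates (ddof=0)
bias = np.mean(theta_hat) - theta        # average estimate minus the truth

print(f"MSE          = {mse:.4f}")
print(f"Var + Bias^2 = {variance + bias ** 2:.4f}")
```

The two printed numbers agree, but only because we chose `theta` ourselves. With real data there is no such value to compare against.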

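However, what we do observe in real life is the ground truth, which is the realization of the target variable on some test data (often called y). So, from our perspective, it makes much more sense to replace θ with y and θ̂ with the model’s prediction ŷ, and obtain the following equation:

$$\mathrm{MSE}(\hat{y}, y) = \mathrm{Var}(\hat{y}) + \mathrm{Bias}(\hat{y}, y)^2$$

where, taking the expectation over models trained on different training sets:

$$\mathrm{MSE}(\hat{y}, y) = \mathbb{E}\big[(\hat{y} - y)^2\big], \qquad \mathrm{Var}(\hat{y}) = \mathbb{E}\big[(\hat{y} - \mathbb{E}[\hat{y}])^2\big], \qquad \mathrm{Bias}(\hat{y}, y) = \mathbb{E}[\hat{y}] - y$$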

This formula is much more convenient because in a real-life setting we actually know all these quantities.
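As a rough sketch of how these quantities can be computed from scratch, the snippet below trains many models on bootstrap resamples of a training set and compares their predictions on the same test points. The synthetic data, the polynomial model, the number of rounds and the variable names are all assumptions chosen for illustration. This is one possible way to estimate the three terms, not the only one.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_data(n):
    """Synthetic data (an assumption): noisy quadratic signal."""
    x = rng.uniform(-3, 3, size=n)
    y = x ** 2 + rng.normal(scale=1.0, size=n)
    return x, y

x_train, y_train = make_data(200)
x_test, y_test = make_data(100)

n_rounds = 200  # number of bootstrap resamples, i.e. number of trained models
degree = 1      # polynomial degree of the model (a straight line here)

# all_pred[i, j] = prediction of the i-th model for the j-th test point
all_pred = np.empty((n_rounds, len(x_test)))
for i in range(n_rounds):
    # Draw a bootstrap sample of the training set (with replacement).
    idx = rng.integers(0, len(x_train), size=len(x_train))
    coefs = np.polyfit(x_train[idx], y_train[idx], deg=degree)
    all_pred[i] = np.polyval(coefs, x_test)

avg_pred = all_pred.mean(axis=0)                # average prediction per test point

mse = np.mean((all_pred - y_test) ** 2)         # averaged over models and test points
variance = np.mean((all_pred - avg_pred) ** 2)  # spread of predictions around their average
bias_sq = np.mean((avg_pred - y_test) ** 2)     # squared gap between average prediction and truth

print(f"MSE = {mse:.3f}, Variance = {variance:.3f}, Bias^2 = {bias_sq:.3f}")
```

With these definitions, `mse` equals `variance + bias_sq` up to floating-point error. Raising `degree` is a simple way to watch the balance between the two terms shift.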

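Also, with this version of the formula, we are able to give a more informal interpretation of MSE, Variance and Bias:

MSE: how far, on average, the predictions are from the ground truth.

Variance: how much the predictions for the same observation change from one trained model to another.

Bias: how far the average prediction is from the ground truth.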

Applied Scientist | I write original content for data science practitioners.