The statistical foundations of machine learning

A look beyond function fitting

Tivadar Danka
Towards Data Science
17 min read · Feb 5, 2020

Developing machine learning algorithms is easier than ever. There are several high-level libraries like TensorFlow, PyTorch, or scikit-learn to build upon, and thanks to the amazing effort of many talented developers, these are easy to use and require only a superficial familiarity with the underlying algorithms. However, this convenience comes at the cost of deep understanding: without proper theoretical foundations, one quickly gets overwhelmed by complex technical details.

My aim is to demonstrate how seemingly obscure and ad hoc methods in machine learning can be explained with simple and natural ideas once we have the appropriate perspective. Our mathematical tool for this is probability theory and statistics, which lie at the foundation of predictive models. Using probability theory is not just an academic exercise; it provides deep insight into how machine learning works, giving you the tools to improve on the state of the art.

Before we begin our journey into the theoretical foundations of machine learning, let's look at a toy problem!

Fitting models

Suppose that we have two numerical quantities, say x and y, that are related to each other. For…
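
As a concrete illustration of this setup, here is a minimal sketch: we assume a noisy linear relationship between x and y and fit a line to the samples with scikit-learn. The data-generating function, the noise level, and the choice of a linear model are illustrative assumptions, not details taken from the article.

import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative (assumed) data: y depends linearly on x, corrupted by
# Gaussian noise. True slope is 2.0, true intercept is 1.0.
rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, size=(100, 1))
y = 2.0 * x[:, 0] + 1.0 + rng.normal(scale=0.5, size=100)

# Fit a line to the noisy observations, recovering estimates of the
# slope and intercept from the samples alone.
model = LinearRegression().fit(x, y)
print(f"slope: {model.coef_[0]:.3f}, intercept: {model.intercept_:.3f}")

Viewed purely as function fitting, this is all there is to it; the point of the probabilistic perspective promised above is to explain why such a fit is a sensible thing to do in the first place.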
