Understanding Maximum Likelihood Estimation (MLE)
What Is It? And What Is It Used For?
The first time I learned MLE, I remember just thinking, “Huh?” It sounded more philosophical and idealistic than practical. But it turns out that MLE is actually quite practical and is a critical component of some widely used data science tools like logistic regression.
Let’s go over how MLE works and how we can use it to estimate the betas of a logistic regression model.
What Is MLE?
At its simplest, MLE is a method for estimating parameters. Every time we fit a statistical or machine learning model, we are estimating parameters. For example, a single-variable linear regression has the equation:
Y = B0 + B1*X
Our goal when we fit this model is to estimate the parameters B0 and B1 given our observed values of Y and X. Linear regression is typically fit with Ordinary Least Squares (OLS) rather than MLE, but like OLS, MLE is a way to estimate a model's parameters given the data we observe.
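To make this concrete, here is a minimal sketch (not from the article) of fitting the single-variable linear regression above with OLS using NumPy. The data, the true parameter values, and the noise level are all made up for illustration.

```python
import numpy as np

# Simulated data: true B0 = 2.0, true B1 = 3.0, plus some noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=100)
Y = 2.0 + 3.0 * X + rng.normal(scale=0.5, size=100)

# Design matrix with an intercept column, then solve the least-squares problem
A = np.column_stack([np.ones_like(X), X])
(b0, b1), *_ = np.linalg.lstsq(A, Y, rcond=None)
print(b0, b1)  # estimates should land near the true values 2.0 and 3.0
```

The estimates recover B0 and B1 closely because OLS picks the line that minimizes the sum of squared residuals over the observed (X, Y) pairs.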
MLE asks the question, “Given the data that we observe (our sample), what are the model parameters that maximize the likelihood of the observed data occurring?”
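That question can be illustrated with the simplest possible model: estimating the probability of heads, p, from a made-up sequence of coin flips (this example is mine, not the article's). We evaluate the log-likelihood of the observed flips at many candidate values of p and keep the one that makes the data most likely.

```python
import numpy as np

# Made-up observed data: 7 heads (1) out of 10 flips
flips = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])
heads = flips.sum()
n = len(flips)

# Candidate values for the parameter p, the probability of heads
candidates = np.linspace(0.01, 0.99, 99)

# Log-likelihood of the observed flips under each candidate p
log_lik = heads * np.log(candidates) + (n - heads) * np.log(1 - candidates)

# The MLE is the candidate that maximizes the likelihood of the observed data
best_p = candidates[np.argmax(log_lik)]
print(best_p)
```

The maximizing value is the sample proportion of heads, 7/10, which matches the closed-form MLE for a Bernoulli parameter. The grid search stands in for the calculus we would normally do; the point is the question MLE asks, not the search method.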