Bayes’ classifier with Maximum Likelihood Estimation

Suhyun Kim
Towards Data Science
7 min read · Jul 6, 2018


The essential concept of supervised learning is that you are given labeled data to train a model. We assume there exists an optimal, relatively simple classifier that maps inputs to their correct labels for most inputs. After training, the goal is to find an approximation of that classifier which performs nearly as well, so that the same classifier can be applied to unlabeled/unseen data.

Statistical Model Approach

In the beginning, labeled training data are given for training purposes. A certain probability distribution is assumed for the data, and the parameters it requires are estimated from the training data so they can be used in the classifier; the resulting classifier is then evaluated on the testing data.

(Figure source: http://www.cs.columbia.edu/~verma/classes/ml/lec/lec1_intro_mle_bayes_naive_evaluation.pdf)

When the initial data are given, the assumption is that they were drawn INDEPENDENTLY and IDENTICALLY DISTRIBUTED (i.i.d.). Then the type of data is examined to decide which probability model can be used. For example, if the data are coin tosses, a Bernoulli model is used; if they are dice rolls, a multinomial model can be used. In my example below, a Gaussian model, the most commonly encountered case, is used. To make sure the distribution really is normal, a normality test is often performed.
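As a quick illustration, here is a minimal sketch of such a normality check using SciPy's Shapiro-Wilk test. The feature values below are synthetic, made up purely for demonstration:

```python
import numpy as np
from scipy import stats

# Hypothetical 1-D feature assumed to be sampled i.i.d.; in practice this
# would be one column of the training data.
rng = np.random.default_rng(0)
feature = rng.normal(loc=5.0, scale=2.0, size=200)

# Shapiro-Wilk test: the null hypothesis is that the sample is normal.
statistic, p_value = stats.shapiro(feature)
if p_value > 0.05:
    print(f"p = {p_value:.3f}: no evidence against normality; a Gaussian model is reasonable")
else:
    print(f"p = {p_value:.3f}: normality rejected; consider a different model")
```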

In the learning algorithm phase, the input is the training data and the output is the parameters that are required…
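Although the text cuts off here, the learning step for the Gaussian case can be sketched as follows. This is a minimal illustration, not the author's exact code: it assumes independent features within each class (a naive Bayes simplification) and estimates each class's mean and variance by maximum likelihood, which for a Gaussian are the sample mean and the biased sample variance:

```python
import numpy as np

def fit_gaussian_mle(X, y):
    """Estimate class priors and per-class Gaussian parameters by MLE."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = {
            "prior": len(Xc) / len(X),   # P(class), estimated by class frequency
            "mean": Xc.mean(axis=0),     # MLE of the mean, per feature
            "var": Xc.var(axis=0),       # MLE of the variance (ddof=0, biased)
        }
    return params

def predict(X, params):
    """Pick the class maximizing log P(x | class) + log P(class)."""
    classes = list(params)
    scores = []
    for c in classes:
        p = params[c]
        # Log-density of independent Gaussians, summed over features.
        log_lik = -0.5 * np.sum(
            np.log(2 * np.pi * p["var"]) + (X - p["mean"]) ** 2 / p["var"],
            axis=1,
        )
        scores.append(log_lik + np.log(p["prior"]))
    return np.array(classes)[np.argmax(scores, axis=0)]

# Toy usage with made-up data: two Gaussian blobs in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(predict(X[:3], fit_gaussian_mle(X, y)))  # expected: [0 0 0]
```

Note that NumPy's default `ddof=0` variance is used deliberately: the biased sample variance is the maximum likelihood estimate for a Gaussian, as opposed to the unbiased `ddof=1` estimator.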
