
Machine Learning

Confusion Matrix — What is it?

Computing a matrix needed to evaluate binary classification

Sujeeth Kumaravel
5 min read · Jun 22, 2019


You have a binary classification problem at hand. Let’s denote the two classes in the target variable as ‘Negative’ and ‘Positive’. You have the dataset to be used to develop the classifier, have performed exploratory data analysis, feature engineering, and have come to a conclusion on what model should be trained. You have divided your data into training and test sets and used the training set to train your model. Then, you feed each instance in your test data to your trained model and get one of the two classes as the corresponding output.

Now, how do you know if your model is performing well? Not all the test set rows classified as ‘Negative’ by the model will actually be ‘Negative’. The same applies to ‘Positive’. In other words, how do you evaluate your model? This is where a confusion matrix comes into play.

A confusion matrix is a 2x2 matrix with the following structure:
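                      Negative (predicted)        Positive (predicted)
Negative (actual)     number of true negatives    number of false positives
Positive (actual)     number of false negatives   number of true positives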

The number of true negatives is the number of rows that the model classified as ‘Negative’ and that are actually ‘Negative’.

The number of false negatives is the number of rows that the model classified as ‘Negative’ but that are actually ‘Positive’.

With the same reasoning, we can understand what the number of true positives (rows classified as ‘Positive’ that are actually ‘Positive’) and the number of false positives (rows classified as ‘Positive’ that are actually ‘Negative’) mean.

The column label Negative (predicted) indicates that the first column contains the counts of predicted ‘Negatives’, and the column label Positive (predicted) indicates that the second column contains the counts of predicted ‘Positives’. Similarly, the row labels Negative (actual) and Positive (actual) indicate that the two rows contain the counts of actual ‘Negatives’ and ‘Positives’.

Let us see an example. Assume that the test set contains 100 rows, out of which 75 are ‘Negative’ and the remaining 25 are ‘Positive’. Assume that the trained model classified 50 of the actual ‘Negative’ and 15 of the actual ‘Positive’ correctly. This means it classified 25 of the actual ‘Negative’ and 10 of the actual ‘Positive’ incorrectly. Therefore,

number of true negatives = 50

number of false positives = 25

number of true positives = 15

number of false negatives = 10

So the confusion matrix for this example will look like:
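                      Negative (predicted)   Positive (predicted)
Negative (actual)     50                      25
Positive (actual)     10                      15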

For simplicity of representation, let us represent the number of true negatives as TN, number of false negatives as FN, number of true positives as TP, and number of false positives as FP. So the simplified representation of a confusion matrix is:
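                      Negative (predicted)   Positive (predicted)
Negative (actual)     TN                      FP
Positive (actual)     FN                      TP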

It is not hard to understand that the sum of the number of true negatives (TN) and the number of false negatives (FN) is equal to the total number of negatives predicted by the model, i.e. TN + FN = total number of predicted negatives.

Similarly, the sum of the number of true negatives (TN) and the number of false positives (FP) is equal to the total number of actual negatives, i.e. TN + FP = total number of actual negatives.

Along the same lines of reasoning, we understand that,

FP + TP = total number of predicted positives

FN + TP = total number of actual positives.

Of course, the total number of predicted negatives + the total number of predicted positives = total number of test set rows

Similarly, the total number of actual negatives + the total number of actual positives = total number of test set rows.

Adding this information to the confusion matrix:
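                      Negative (predicted)   Positive (predicted)   Total (actual)
Negative (actual)     TN                      FP                     TN + FP
Positive (actual)     FN                      TP                     FN + TP
Total (predicted)     TN + FN                 FP + TP                total test set rows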

Adding this additional information to the confusion matrix in the example discussed earlier:
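                      Negative (predicted)   Positive (predicted)   Total (actual)
Negative (actual)     50                      25                     75
Positive (actual)     10                      15                     25
Total (predicted)     60                      40                     100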

Let us see how to calculate a confusion matrix in Python using scikit-learn’s metrics module. Let us take 10 rows in the test set and denote ‘Negative’ and ‘Positive’ as 0 and 1 respectively. In the following Python code, we import the confusion_matrix function from the metrics module. Then, giving the actual and predicted classes as inputs to the confusion_matrix function, we get the matrix as output.
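The original snippet is not reproduced here; a minimal sketch that matches the counts and output discussed below could look like this (the particular actual and predicted lists are just one illustrative choice that yields those counts):

from sklearn.metrics import confusion_matrix

# Actual (ground-truth) and predicted classes for the 10 test rows,
# with 'Negative' encoded as 0 and 'Positive' encoded as 1.
# These lists are an illustrative choice consistent with the counts below.
actual = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
predicted = [0, 1, 1, 1, 0, 0, 1, 1, 1, 1]

# scikit-learn lays the matrix out with actual classes along the rows and
# predicted classes along the columns, i.e. [[TN, FP], [FN, TP]].
print(confusion_matrix(actual, predicted))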

Examining the actual and predicted lists in the code shows that:

number of true negatives (TN) = 1

number of true positives (TP) = 4

number of false negatives (FN) = 2

number of false positives (FP) = 3

total number of actual negatives (TN + FP) = 4

total number of actual positives (TP + FN) = 6

total number of predicted negatives (TN + FN) = 3

total number of predicted positives (TP + FP) = 7

Printing the confusion matrix in the code above gives the following output:

[[1 3]
[2 4]]

Well, the program gave the correct confusion matrix.

In my future articles, I am going to explain how to use the confusion matrix to evaluate the performance of a binary classifier.

Signing off now!

Further Reading:

The following post by neptune.ai gives a detailed explanation of 24 performance metrics that can be computed using a confusion matrix to evaluate binary classification:


