
Table of Contents
- Introduction
- Motivation Behind Precision and Recall
- What are Precision and Recall?
- How Not To Get Confused?
- Why Precision and Recall over Accuracy?
- When to use Precision and Recall?
- Related Resources
- Conclusion
1. Introduction
In my previous post, I wrote about accuracy as an evaluation metric for Binary Classification models. I used the cancer prediction example to illustrate that accuracy was not enough to assess how a model performs in predicting the minority class (i.e. class of interest or positive class), especially in datasets with class imbalance. The reason is that accuracy does not distinguish the minority class from the majority class (i.e. negative class).
In this post, I will share how precision and recall can mitigate this limitation of accuracy and help shed light on the predictive performance of a binary classification model. I will walk through these concepts using a simple example, step-by-step explanations and animated GIFs (p.s. I’m a believer in simplifying things), and even share a tip on how not to get confused between the two. Are you ready? Let’s dive right in!
2. Motivation Behind Precision and Recall
Let’s start with understanding why precision and recall are important. We do that using a confusion matrix. In general, a confusion matrix for binary classification summarises model predictions into four distinct outcomes, as shown in Figure 1.

In the context of the cancer prediction example in my previous post, these outcomes describe the following four scenarios (see Figure 2):

- Scenario #1: The model predicts cancer for a patient with cancer (True Positive)
- Scenario #2: The model predicts cancer for a patient without cancer (False Positive)
- Scenario #3: The model predicts no cancer for a patient with cancer (False Negative)
- Scenario #4: The model predicts no cancer for a patient without cancer (True Negative)
Of these, Scenarios #1 and #4 are ideal, but Scenarios #2 and #3 are wrong predictions with undesirable consequences. Here’s why:
- Scenario #2 represents False Positives (FPs). This means that of 900 patients who really do not have cancer, the model says 80 of them do. In real life, these 80 patients will probably undergo expensive and unnecessary treatments, at the expense of their well-being.
- Scenario #3 represents False Negatives (FNs). This means that of 100 patients who really have cancer, the model says 20 of them do not. The consequences of this are arguably worse because these 20 patients would go undiagnosed and fail to receive proper treatment.
As you can imagine, these two scenarios have very different but nonetheless significant consequences. This is true not only for cancer prediction but for many other applications too. As much as we want all model predictions to fall within Scenarios #1 and #4, we know that no model is perfect in the real world. It is almost certain that model predictions will include some FPs and FNs. The objective, then, is to keep FPs and FNs to a minimum, and the way to assess this is to use precision and recall.
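To make these numbers concrete, here is a minimal sketch in Python using scikit-learn’s `confusion_matrix`, with synthetic labels constructed purely for illustration to match the counts above (1,000 patients, 100 of whom really have cancer):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic labels matching the example: 1,000 patients in total,
# 100 of whom really have cancer (positive class = 1).
y_true = np.array([1] * 100 + [0] * 900)

# Hypothetical predictions: the model catches 80 of the 100 cancer
# patients (TP) and misses 20 (FN); it wrongly flags 80 of the 900
# healthy patients (FP) and correctly clears the other 820 (TN).
y_pred = np.array([1] * 80 + [0] * 20 + [1] * 80 + [0] * 820)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=80, FP=80, FN=20, TN=820
```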
3. What are Precision and Recall?
Having explained why precision and recall are important, let’s introduce them formally. Precision and recall are not difficult concepts to grasp per se, but it is easy to get lost in the nomenclatures of TP, FP, TN and FN, as well as their mathematical formulae. I’ve created animated GIFs to help you better visualise how precision and recall are computed.


It is always easier to grasp concepts if we contextualise them with examples. So, let’s do that using our cancer prediction example.
- Precision is about asking the question, "Of all patients predicted to have cancer, how many really have cancer?" In our example, 160 patients were predicted by the model to have cancer, but only 80 of them really have cancer. Precision is therefore 0.5.
- Recall is about asking the question, "Of all patients who really have cancer, how many were predicted to have cancer?" In our example, there are 100 patients who really have cancer. Of these, the model correctly predicted 80. Recall is therefore 0.8. (Both numbers are verified in the short sketch below.)
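If you’d like to check these numbers yourself, here is a quick sketch using scikit-learn’s built-in metrics, with the same synthetic labels as in the earlier snippet:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Same synthetic labels as in the confusion matrix snippet above.
y_true = np.array([1] * 100 + [0] * 900)
y_pred = np.array([1] * 80 + [0] * 20 + [1] * 80 + [0] * 820)

print(precision_score(y_true, y_pred))  # 0.5  -> 80 / (80 + 80)
print(recall_score(y_true, y_pred))     # 0.8  -> 80 / (80 + 20)
```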
I mentioned above that precision and recall allow us to assess the extent of errors contributed by FPs and FNs. Let me explain. Formally, precision and recall are given by:

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$
Notice how FP and FN appear in the denominators for precision and recall respectively. What this implies is:
- The fewer the FPs, the higher the precision; and
- The fewer the FNs, the higher the recall.
What this also means is that if there are no FNs and FPs at all, i.e. the model makes perfect predictions, both precision and recall will be 1. In reality, this is difficult to achieve. There is also a trade-off between precision and recall – increasing one typically lowers the other. In practice, the closer precision and recall are to 1, the better the model’s performance.
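To see this effect directly, here is a tiny pure-Python sketch of the two formulae; the starting numbers are from our running example:

```python
def precision(tp: int, fp: int) -> float:
    """Of all cases predicted positive, how many are really positive?"""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Of all cases that are really positive, how many were predicted positive?"""
    return tp / (tp + fn)

print(precision(tp=80, fp=80))  # 0.5   -- halving the FPs...
print(precision(tp=80, fp=40))  # 0.667 -- ...raises precision
print(recall(tp=80, fn=20))     # 0.8
print(recall(tp=80, fn=0))      # 1.0   -- no FNs gives perfect recall
```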
4. How Not To Get Confused?
It is easy to confuse precision and recall because they are so similar. I’ve confused them countless times and had to Google each time to know which is which… until I came up with a simple trick. Here’s the trick to help you remember them better:
Precision starts with the letter "P", so we associate it with the word predicted. On the other hand, recall begins with the letter "R", so we associate it with the word really.
- When you think about precision, think about "P" preceding "R" (i.e. predicted before really), like this:
"Of all predicted as positive cases, how many are really positive?"
- Whereas, for recall, think about "R" preceding "P" (i.e. really before predicted):
"Of all cases which are really positive, how many are predicted as positive?"
Something that I realised would not help was trying to memorise the mathematical formulae for precision and recall. Of course, go ahead and do that if it works for you. Otherwise, chances are that you will get TP, FP, FN and TN mixed up, like I always did. Just think about the trick I shared and then visualise in your head how precision and recall are derived (see the animated GIFs in Figures 3 and 4). It’s better to understand how they are derived than to plainly memorise 🙂
5. Why Precision and Recall over Accuracy?
(1) Assessing the extent of prediction errors
As explained above, precision and recall allow us to assess the extent of errors contributed by FPs and FNs. Given that these two types of errors can have very different impacts, it makes sense to have separate metrics to evaluate the extent of each error. Accuracy cannot be used in this regard because it implicitly assumes both types of errors to be of equal importance², which we know is not the case.
(2) Assessing predictive performance on minority class
We established in my previous post that accuracy was not enough to assess how a model performs in predicting the minority class, because it does not differentiate it from the majority class. Precision and recall, however, do the exact opposite. They focus on the correctly predicted positive class (notice how the numerator of both formulae is "TP"). Conversely, they really don’t care about the correctly predicted negative class ("TN" does not appear at all in either formula).
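A quick sketch makes the contrast obvious: on our imbalanced example, a degenerate model that predicts "no cancer" for everyone scores an impressive-looking 90% accuracy, yet its recall is 0 because it never finds a single cancer patient (the `zero_division=0` argument just suppresses the warning scikit-learn raises when there are no positive predictions):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = np.array([1] * 100 + [0] * 900)   # 10% minority (positive) class
y_pred = np.zeros(1000, dtype=int)         # degenerate model: always "no cancer"

print(accuracy_score(y_true, y_pred))      # 0.9 -- looks great...
print(recall_score(y_true, y_pred))        # 0.0 -- ...but misses every cancer patient
print(precision_score(y_true, y_pred, zero_division=0))  # 0.0 (no positive predictions)
```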
6. When to use Precision and Recall?
So, should you always use precision and recall? A few simple things to consider are:
- Is it a binary classification problem?
- Is your training dataset imbalanced across the classes?
- Is there a particular class of interest (i.e. minority class) in your training dataset?
If the answers to the above are "yes", then you should ditch accuracy right away and use precision and/or recall. Of course, in reality there could be other factors to consider, but for now, this is a good starting point.
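As a quick check for the second question above, you can simply count the labels in your training set. A minimal sketch, assuming `y_train` is a hypothetical NumPy array of binary labels (substitute your own):

```python
import numpy as np

# Hypothetical training labels; replace with your own y_train.
y_train = np.array([1] * 100 + [0] * 900)

classes, counts = np.unique(y_train, return_counts=True)
for c, n in zip(classes, counts):
    print(f"class {c}: {n} samples")   # class 0: 900 samples / class 1: 100 samples
print(f"imbalance ratio: {counts.min() / counts.max():.2f}")  # 0.11 -- far from balanced
```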
7. Related Resources
I’d like to acknowledge that the following article, excellently written by Boaz Shmueli, helped me a lot when I first started learning about classification metrics. I highly recommend it:
Multi-Class Metrics Made Simple, Part I: Precision and Recall
The other resources I found useful are:
- Model Evaluation I: Precision And Recall by Tejumade Afonja
- Precision vs. Recall – An Intuitive Guide for Every Machine Learning Person by Purva Huilgol
8. Conclusion
Thank you for coming this far! I hope you have benefited from this post and gained a better understanding of precision and recall – two very important metrics for evaluating classification models. Feel free to save the cheatsheet below for future reference.

There is still a lot more about precision and recall that I’ve yet to cover in this post. Which should you use – precision or recall? Which is better? How do you select a model using both precision and recall? What about multi-class classification problems? In my subsequent posts, I hope to go beyond the fundamentals to address these questions. See you in my next post!
I like to break Data Science concepts down into simple, bite-sized chunks with clear and intuitive explanations. After all, that’s how I found myself learning most effectively. By sharing how I simplify concepts, I hope to help people lower their barriers to entry into learning Data Science. If you’ve found this post useful, feel free to let me know in the comments! I welcome discussions, questions and constructive feedback too. You can also connect with me via LinkedIn. Have a great day!
References
- Mike Wasikowski and Xue-wen Chen. Combating the Small Sample Class Imbalance Problem Using Feature Selection. IEEE Transactions on Knowledge and Data Engineering, 22(10):1388–1400, October 2010. ISSN 1041–4347.
- Foster Provost and Tom Fawcett. Data Science for Business. O’Reilly Media, Inc., first edition, December 2013.