Understanding the ROC Curve with Python
If you google "ROC curve machine learning", you get a Wikipedia answer like this:
"A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied."
Another common description is that the ROC Curve reflects the sensitivity of the model across different classification thresholds. When I was starting out in Machine Learning, these definitions would always confuse me.
In this article, I will share how I learned to disentangle my "beginner-like" confusions and develop a good enough intuition about the ROC curve.
What the Heck Is the ROC Curve?
One way to understand the ROC curve is that it describes the relationship between the model's sensitivity (the true-positive rate, or TPR) and its specificity (which equals 1 - FPR, where FPR is the false-positive rate).
Now, let’s disentangle each concept here.
The TPR, known as the sensitivity of the model, is the number of correct classifications of the "positive" class divided by the total number of positive examples in the dataset. Mathematically:

TPR = TP / (TP + FN)
while the FPR is the number of false positives (negative examples misclassified as positive) divided by the total number of negative examples. Mathematically:

FPR = FP / (FP + TN)
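To make these formulas concrete, here is a minimal sketch (with hypothetical toy labels, not the data used later in this article) that computes both rates from a confusion matrix:

import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels, purely to illustrate the formulas
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])  # 4 positives, 4 negatives
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# The raveled confusion matrix gives counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp / (tp + fn))  # TPR = 3 / (3 + 1) = 0.75
print(fp / (fp + tn))  # FPR = 1 / (1 + 3) = 0.25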
So, in essence, the ROC curve shows how the sensitivity of the model changes with respect to the false-positive rate as you vary the threshold score, the value that serves as the model's decision boundary for classifying an input as positive.
Making Sense of the Threshold Scores
The intuition that eluded me in the beginning was the role of the threshold score. One good starting point is to build a mental picture:

From this classic visualization, one can draw a first intuition: the ideal model keeps the true-positive rate as high as possible while keeping the false-positive rate as low as possible.
The threshold corresponds to some value T (for example, a value between 0 and 1 when the model outputs probabilities) that serves as the decision boundary for the classifier, and it controls the trade-off between TPR and FPR.
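To see what varying T does in code, here is a minimal sketch with hypothetical probability scores:

import numpy as np

# Hypothetical predicted probabilities for five inputs
proba = np.array([0.1, 0.35, 0.4, 0.65, 0.8])

for T in (0.3, 0.5, 0.7):
    print(f"T={T}:", (proba >= T).astype(int))
# T=0.3: [0 1 1 1 1]
# T=0.5: [0 0 0 1 1]
# T=0.7: [0 0 0 0 1]
# Lowering T flags more inputs as positive, raising both TPR and FPR.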
Let’s write some code to get a visual of all of these components.
Visualizing the ROC Curve
The steps to visualize this will be:
- Import our dependencies
- Draw some fake data with the drawdata package for Jupyter notebooks
- Import the fake data into a pandas DataFrame
- Fit a logistic regression model on the data
- Get predictions from the logistic regression model as probability values
- Set different threshold scores
- Visualize the ROC curve plot
- Draw some final conclusions
1. Import our dependencies
from drawdata import draw_scatter
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
2. Draw some fake data with the drawdata package for Jupyter notebooks
draw_scatter()
Output:

3. Import the fake data into a pandas DataFrame
df = pd.read_csv("./data.csv")
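Before fitting anything, a quick sanity check of the export is worthwhile. This assumes drawdata's default CSV layout (columns x, y, and a string label column z); adjust the names if your export differs:

# Quick sanity check of the drawn dataset
print(df.shape)
print(df.head())
print(df["z"].value_counts())  # points per drawn class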
4. Fit a logistic regression model on the data
def get_fp_tp(y, proba, threshold):
    """Return the number of false positives and true positives."""
    # source: https://towardsdatascience.com/roc-curve-explained-50acab4f7bd8
    # Classify each probability into a class using the threshold
    pred = pd.Series(np.where(proba >= threshold, 1, 0), dtype="category")
    pred = pred.cat.set_categories([0, 1])
    # Create the confusion matrix (observed=False keeps a category
    # even when no prediction falls into it)
    confusion_matrix = (
        pred.groupby([y, pred], observed=False).size().unstack()
        .rename(columns={0: "pred_0", 1: "pred_1"},
                index={0: "actual_0", 1: "actual_1"})
    )
    false_positives = confusion_matrix.loc["actual_0", "pred_1"]
    true_positives = confusion_matrix.loc["actual_1", "pred_1"]
    return false_positives, true_positives
# train test split on the fake generated dataset
X = df[["x", "y"]].values
Y = df["z"].values
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
# encode the string labels: the class drawn as "a" becomes the positive class (1)
y_test = np.array([1 if label == "a" else 0 for label in y_test])
y_train = np.array([1 if label == "a" else 0 for label in y_train])
# create and fit the model
lgr = LogisticRegression()
lgr.fit(X_train, y_train)
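As an optional sanity check (not part of the original walkthrough), the held-out accuracy gives a rough sense that the fit is reasonable before looking at the ROC curve:

# Mean accuracy on the test set (uses a default 0.5 threshold internally)
print(lgr.score(X_test, y_test))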
5. Get predictions of the logistic regression model in the form of probability values
y_hat = lgr.predict_proba(X_test)[:,1]
6. Set different threshold scores
thresholds = np.linspace(0,1,100)
7. Visualize the ROC curve plot
# lists to accumulate the true-positive and false-positive rates
tpr = []
fpr = []
# total number of positives and negatives in the test set
positives = np.sum(y_test == 1)
negatives = np.sum(y_test == 0)
# loop over the threshold scores, collecting both rates at each one
for th in thresholds:
    fp, tp = get_fp_tp(y_test, y_hat, th)
    tpr.append(tp / positives)
    fpr.append(fp / negatives)
plt.plot([0, 1], [0, 1], linestyle="--", lw=2, color="r", label="Random", alpha=0.8)
plt.plot(fpr, tpr, label="ROC Curve", color="blue")
plt.text(0.5, 0.5, "varying threshold scores (0-1)", rotation=0, size=12,
         ha="center", va="center", bbox=dict(boxstyle="rarrow"))
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()

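As a cross-check on the manual loop, scikit-learn can compute the same curve directly from the true labels and predicted probabilities. A minimal sketch using roc_curve and roc_auc_score:

from sklearn.metrics import roc_curve, roc_auc_score

# FPR/TPR at each threshold where the curve changes, plus those thresholds
fpr_sk, tpr_sk, thresholds_sk = roc_curve(y_test, y_hat)
print(f"AUC: {roc_auc_score(y_test, y_hat):.3f}")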
8. Draw some final conclusions
As we lower the threshold score, both the true-positive and the false-positive rates increase. A good model is one where some threshold score puts the true-positive rate as close as possible to 1 while keeping the false-positive rate as low as possible.
But, how could we choose the best classification threshold?
A simple method is to take the threshold with the maximal sum of the true-positive rate and the true-negative rate (1 - FPR); this is equivalent to maximizing Youden's J statistic (TPR - FPR).
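Using the tpr and fpr lists built above, this criterion is a short sketch:

# Pick the threshold maximizing TPR + (1 - FPR), i.e. Youden's J shifted by 1
tpr_arr, fpr_arr = np.array(tpr), np.array(fpr)
best_threshold = thresholds[np.argmax(tpr_arr + (1 - fpr_arr))]
print(best_threshold)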
Another criterion could be to simply choose the point closest to the top-left corner of your ROC space. However, that implies that the true-positive rate and the true-negative rate have the same weight ([source](https://stats.stackexchange.com/questions/123124/how-to-determine-the-optimal-threshold-for-a-classifier-and-generate-roc-curve)), which is not necessarily true in cases like cancer classification, where a false negative (a missed cancer) is far more costly than a false positive.
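The closest-to-corner criterion is just as easy to sketch, reusing the arrays from the snippet above:

# Euclidean distance of each ROC point to the ideal corner (FPR=0, TPR=1)
distances = np.sqrt(fpr_arr ** 2 + (1 - tpr_arr) ** 2)
print(thresholds[np.argmin(distances)])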
Final Thoughts on ROC Curves
I think taking some time to digest evaluation metrics is extremely beneficial in the long run for your Machine Learning journey. In this article you learned:
- A basic intuition on how ROC curves work
- How classification thresholds affect the relationship between the sensitivity and specificity of a model
- An intuition on how to use the ROC curve to set an optimal classification threshold
If you want to learn more about Python for Machine Learning, check out these courses:
These are affiliate links, if you use them I get a small commission, cheers! 🙂
- Machine Learning A-Z™: Hands-On Python & R In Data Science
- Python for Data Science and Machine Learning Bootcamp
If you liked this post, follow me on Medium, subscribe to my newsletter, connect with me on Twitter, LinkedIn, Instagram, or join Medium! Thanks and see you next time! 🙂