Model Interpretability
Making black box models a thing of the past

With all the complexity that goes into developing machine learning models, it comes as no surprise that some of them don’t translate very well into plain English. The inputs go in, the answers come out, and no one can say exactly how the model arrived at its conclusion. This creates a disconnect, and a lack of transparency, between members of the same team. As machine learning has become more prevalent in recent years, this lack of explainability around complex models has only grown. In this article, I’ll discuss a few ways to make your models more explainable to the average person, whether that’s your non-technical manager or just a curious friend.
Why is explainability important?
The responsibility placed on machine learning models has only increased over time. They handle everything from filtering spam in your email to deciding whether you qualify for that new job or loan you’ve been looking for. When these models can’t be explained in plain English, a lack of trust ensues and people become reluctant to use them for important decisions. It would be a shame if the model you worked so hard to create ended up being discarded because no one could understand what it was doing. When you can explain a model and show the insights that come from it, people (especially those with no background in data science) will be a lot more likely to trust and use the models you create.
Interpreting Coefficients
On one end of the spectrum, we have simple models like linear regression. Models like this are quite simple to explain, with each coefficient representing how much a feature affects our target.

The image above shows the plot for a model represented by the equation y=2x. This just means that for an increase of 1 in feature x, the target variable will increase by 2. You can have multiple features like this; each one with its own coefficient representing its effect on the target.
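As a quick sketch of what reading a coefficient looks like in code, here is a minimal example with scikit-learn; the data below is synthetic and exists purely to illustrate the idea:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data that follows y = 2x, purely for illustration
X = np.arange(0, 10, 0.5).reshape(-1, 1)
y = 2 * X.ravel()

model = LinearRegression().fit(X, y)
print(model.coef_[0])  # ~2.0: each unit increase in x adds 2 to the prediction
```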
On the other end, we have "black box" models like neural networks, where all we can see are the inputs and outputs; the steps taken to get from one to the other are hidden behind a sea of incomprehensible numbers.
Partial Dependence
Partial dependence shows how a particular feature affects a prediction. By holding all other features constant, we can see how the feature in question influences the outcome. This is similar to interpreting coefficients as described in the previous section, but partial dependence lets us generalize that interpretation to models more sophisticated than simple linear regression.
As an example, we’ll be using a decision tree on this Cardiovascular Disease dataset on Kaggle. The library we’ll be using to plot partial dependence is pdpbox. Let’s train the model and see how this all works.
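Roughly, the setup looks like the sketch below. It assumes the Kaggle file is saved as cardio_train.csv with semicolon separators, that the target column is cardio, and that we are using the pdpbox 0.2-style API (pdp.pdp_isolate / pdp.pdp_plot):

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from pdpbox import pdp

# Load the cardiovascular disease data (file name and separator assumed)
data = pd.read_csv('cardio_train.csv', sep=';')
features = [col for col in data.columns if col not in ('id', 'cardio')]
X_train, X_val, y_train, y_val = train_test_split(
    data[features], data['cardio'], random_state=0)

# Fit a simple decision tree classifier
tree_model = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)

# Partial dependence of the prediction on age (recorded in days)
pdp_age = pdp.pdp_isolate(model=tree_model, dataset=X_val,
                          model_features=features, feature='age')
pdp.pdp_plot(pdp_age, 'age')
plt.show()
```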

The plot above shows the partial dependence plot for the feature age. The target variable we are trying to predict is the presence of cardiovascular disease. We can see that once the age feature goes above 19,000 days (around 52 years), it starts to affect the prediction in a positive way: a higher age translates to a higher probability of cardiovascular disease. Since this insight matches our intuition, we are more likely to trust the model's predictions.
The decision tree we are using is still relatively simple, and its partial dependence plot may not paint the whole picture. Let’s try again, this time with a random forest model.
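Continuing the sketch above, the only real change is the estimator:

```python
from sklearn.ensemble import RandomForestClassifier

# Same features and train/validation split as before; only the model changes
rf_model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

pdp_age_rf = pdp.pdp_isolate(model=rf_model, dataset=X_val,
                             model_features=features, feature='age')
pdp.pdp_plot(pdp_age_rf, 'age')
plt.show()
```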

Using a more complex model like a random forest, we see that the age feature affects our predictions more linearly, as opposed to the ‘step-like’ prediction effect we saw when we used the simpler decision tree.
How does it work?
Partial dependence plots rely on a model that has been fit on the data we are working with. Let’s take a single row of our dataset as an example.

Our age variable here has a value of 14501. The model will predict the probability of cardiovascular disease for this row of data. We then repeat this multiple times, changing the value of age each time we make a prediction. What is the probability of cardiovascular disease when age is 12000? 16000? 20000? We keep track of these predictions and see how changing this one variable affects them. Finally, we do this for several rows and take the average prediction at each value of age. Plotting these averages gives us the partial dependence plot seen above.
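pdpbox does all of this bookkeeping for us, but the idea is simple enough to sketch by hand. The helper and grid values below are illustrative rather than the library's actual implementation, and they reuse the rf_model and X_val from the earlier sketch:

```python
import numpy as np

def manual_partial_dependence(model, X, feature, grid):
    """Average predicted probability over all rows for each candidate feature value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[feature] = value                      # overwrite the feature in every row
        probs = model.predict_proba(X_mod)[:, 1]    # P(cardiovascular disease) per row
        averages.append(probs.mean())               # average over the rows
    return np.array(averages)

# Sweep age (in days) from roughly 30 to 60 years
age_grid = np.arange(11000, 22000, 1000)
pd_values = manual_partial_dependence(rf_model, X_val, 'age', age_grid)
```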
A Step Further
Now that we’ve seen how partial dependence works with a single variable, let’s look at how it works with feature interactions! Say we wanted to see how height and weight interact to affect our predictions. We can use a partial dependence plot to visualize this interaction as well. We’ll use the same random forest model from the previous section. By changing our code a little bit, we can come up with an entirely different-looking plot that helps us see feature interactions.
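With pdpbox, that small change looks roughly like this (again assuming the 0.2-style interaction API and the rf_model, X_val, and features from the earlier sketches):

```python
# Joint partial dependence of height and weight
interaction = pdp.pdp_interact(model=rf_model, dataset=X_val,
                               model_features=features,
                               features=['height', 'weight'])
pdp.pdp_interact_plot(pdp_interact_out=interaction,
                      feature_names=['height', 'weight'],
                      plot_type='contour')
plt.show()
```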

This plot not only looks pretty, it also tells us a lot about how height and weight interact to affect our predictions. The height variable has less of an effect, since the color of the plot changes little as we move along the x-axis. The weight variable seems to have a much stronger effect on the probability of cardiovascular disease, since the predictions are positively affected as we move up the y-axis. Once again, this matches our intuition: a person with a higher weight is more likely to have cardiovascular disease. With this insight from our model, we are that much more inclined to trust its predictions. We can do this with any two features to answer different hypotheses we might have about our data.
Conclusion
We saw the importance of being able to explain machine learning models to a non-technical audience. When a model is distilled down to easily understandable insights, people are more likely to trust its predictions and use them in the long run. This can be very helpful when trying to gain traction for a machine learning project that people may be skeptical of.
We saw simple models like linear regression, whose predictions can be interpreted directly from their coefficients. We were able to draw the same kinds of insights from more complex models using partial dependence plots, and even to see how two features interact to affect a prediction using interaction plots. With this knowledge in mind, let’s retire the notion that machine learning models are getting too complex for human beings to understand!
Thank you for reading!