Kazimir Malevich’s 1915 “Black Square”

How to avoid the Machine Learning blackbox with SHAP

Miguel José Monteiro
Towards Data Science
5 min read · Dec 27, 2018


I believe we all agree with the idea that Machine Learning has brought tremendous improvement to tech products and therefore to the well-being of us all. However, companies and researchers who work on Machine Learning face one huge challenge: “blackbox algorithms”.

Blackbox algorithms can be loosely defined as algorithms whose output is not easily interpretable, or not interpretable at all: you get an output from an input, but you don’t understand why. Add to that GDPR’s reinforcement of the “right to explanation”, and avoiding blackbox algorithms becomes a matter of transparency and accountability, two values that any company should strive for.

With that in mind, a very good tool I recently got to explore was SHAP, which stands for SHapley Additive exPlanations.

What is SHAP?

This framework can explain essentially any Machine Learning model by providing intuitive, interactive visualizations that show which features are most relevant for a given prediction and for the model as a whole.

I won’t dive into the theory too much (the research team at the University of Washington behind SHAP explains it in their NIPS paper), but the basic idea is to train a much simpler, interpretable model (e.g. a linear regression or decision tree) on top of the originally trained model so that it approximates the original predictions. That way, those predictions can now be explained by the simpler model.

I want to share with you my implementation of the framework on an open dataset and what type of explanations we can expect from it.

What does SHAP teach us?

I started out with the Boston housing dataset (check out the link for the meanings of the feature acronyms). This dataset contains information collected by the U.S. Census Service concerning housing in the area of Boston, where the target variable is the median house value in thousands of dollars. I trained an XGBoost regressor on this dataset with hyperparameter optimization using grid search.
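For reference, a minimal sketch of that setup could look like the following. The exact grid values are illustrative (not the ones I used), and it assumes a shap release that still ships shap.datasets.boston():

```python
import shap
import xgboost as xgb
from sklearn.model_selection import GridSearchCV, train_test_split

# Boston housing data: features X (DataFrame) and target y (median house value, in $1000s).
# Assumes an older shap release that still includes this dataset loader.
X, y = shap.datasets.boston()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Small, purely illustrative hyperparameter grid
param_grid = {
    "max_depth": [3, 4, 5],
    "n_estimators": [100, 300],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    xgb.XGBRegressor(objective="reg:squarederror"),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)
model = search.best_estimator_
```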

Now, let’s take a look at the output from SHAP.

Prediction Explainer

This first chart aims at explaining individual predictions. In this case, I selected the first sample of the test set. We can see that the model predicted a median house value of 16.51 ($16,510). Additionally, we can see which features push that value higher (red) or lower (blue). Here, LSTAT being equal to 4.98 is the feature that raises the predicted value the most. On the other hand, the values of RM, NOX and DIS push the prediction down.
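A sketch of how such a chart is typically produced with shap’s TreeExplainer, reusing the model and X_test assumed in the training sketch above:

```python
import numpy as np
import shap

shap.initjs()  # load the JavaScript needed for the interactive plots in a notebook

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)  # one row of SHAP values per sample

# The "additive" part: base value + the sample's SHAP values recovers the prediction
# (up to numerical precision).
pred = model.predict(X_test.iloc[[0]])[0]
assert np.isclose(explainer.expected_value + shap_values[0].sum(), pred, atol=1e-3)

# Force plot explaining the first sample of the test set
shap.force_plot(explainer.expected_value, shap_values[0, :], X_test.iloc[0, :])
```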

Model Explainer

This chart has the purpose of explaining the model as a whole. It essentially has all the samples plotted on the x-axis (in this case, ordered by similarity, but it can be changed in the combo box) and their prediction values plotted on the y-axis. Also, it has the individual contributions of each feature for each sample, based on feature value.

In this example, I selected sample number 60 (x-axis), which has a prediction value of 10.01 (y-axis) and where RM and LSTAT are the most relevant features — they push the prediction down though. However, by just hovering over other samples, it’s possible to see how feature values and their impact change, as well as the predictions.
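The many-sample view is the same call, but passing the whole SHAP matrix instead of a single row (again assuming explainer, shap_values and X_test from the sketches above):

```python
# Interactive, stacked force plot over the full test set;
# the ordering/colouring dropdowns are part of the generated visualization
shap.force_plot(explainer.expected_value, shap_values, X_test)
```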

Dependence Plot

In this chart we can see how a feature’s value relates to its impact on the model output, measured by its SHAP value, and how that effect interacts with a second feature (shown as the colour). Since this is a regression model, SHAP values are expressed in the units of the prediction: a value of -2 means that observing that feature value lowers the predicted median house value by 2, i.e. by $2,000.

In this case we can see that for values of RM below 7 (x-axis), the SHAP values (y-axis) are virtually always negative, which means lower values of this feature push the prediction value down. Also, if you have an RM equal to 6, the SHAP value can be anywhere between -2.5 and 0, depending on the value of RAD. This means that having fewer rooms (RM) is less bad if you also have good accessibility to highways (RAD). Notice that this vertical spread narrows a lot for RM values below 5 and above 7, which suggests that other features (like RAD) lose much of the impact they had.
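This is a sketch of the call behind that chart; interaction_index picks the feature used for colouring (shap can also choose it automatically if the argument is omitted):

```python
# SHAP value of RM for every sample, plotted against the RM value and coloured by RAD
shap.dependence_plot("RM", shap_values, X_test, interaction_index="RAD")
```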

Summary Plot

These two charts are different visualizations of the same output: an overall ranking of feature relevance for the model as a whole. The first chart carries some additional information, since it shows the feature relevance for each individual prediction (depicted as a point). Also, for each feature, the colour shows whether a higher or lower feature value pushes the prediction up or down.

As an example, we can see that LSTAT is by far the most influential feature, be it positively or negatively. In this case, lower values of LSTAT seem to influence the prediction positively. We can also see that RM has the opposite behaviour — higher values lead to higher prediction values.

This all makes sense when we consider the meaning of the acronyms. It’s expected that a lower percentage of lower-status population (LSTAT) and a higher number of rooms (RM) both lead to more expensive houses.
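Both views come from shap.summary_plot; a sketch, reusing shap_values and X_test from the earlier snippets:

```python
# Beeswarm view: one point per sample and feature, coloured by the feature value
shap.summary_plot(shap_values, X_test)

# Bar view: mean absolute SHAP value per feature
shap.summary_plot(shap_values, X_test, plot_type="bar")
```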

Wrapping up…

There’s really nothing like trying, right? My suggestion: start with an open dataset, maybe a classification one such as the Iris dataset, train an adequate model and run the framework.

HINT: also, take a look at Scott Lundberg’s post on TDS; he’s one of the original creators of SHAP 😉
