
Explainable AI: Interpretability of Machine Learning Models

Can you trust your Machine learning model?

Model Explainability using LIME

Photo by Andy Kelly on Unsplash

Why should we trust a machine learning model blindly? Wouldn’t it be wonderful if we could get better insight into model predictions and improve our decision making? With the advent of Explainable AI techniques such as LIME and SHAP, this is no longer a challenge. Nowadays, machine learning models are ubiquitous and are becoming a part of our lives more than ever. These models are usually black boxes, in the sense that it’s hard for us to assess their behavior. From smart speakers with built-in conversational agents to personalized recommendation systems, we use them daily, but do we understand why they behave in a certain way? Given their ability to influence our decisions, it is of paramount importance that we are able to trust them. Explainable AI systems help us understand the inner workings of such models.


So, What’s Explainable AI?

Explainable AI can be summed up as a process for understanding the predictions of an ML model. The central idea is to make the model as interpretable as possible, which in turn helps in testing its reliability and the causal role of its features. Broadly speaking, there are two dimensions to interpretability:

  1. Explainability (Why did it do that?)
  2. Transparency (How does it work?)

Typically, explainable AI systems provide an assessment of the model’s input features and identify the ones that are the driving force behind its predictions. This gives us a sense of control, as we can then decide whether we can rely on the predictions of these models. For instance, we would probably trust a flu identification model more if it treats features like temperature and cough as more significant than other symptoms.

Now that you have an idea of explainable systems, how do we explain model predictions?

There are different ways to do that. LIME is one of them. Let’s squeeze it.

LIME stands for:

  • Local: Approximates locally in the neighborhood of the prediction being explained
  • Interpretable: Explanations produced are human-readable
  • Model-Agnostic: Works for any model, like SVMs, neural networks, etc.
  • Explanations: Provides explanations of model predictions (local linear explanations of model behaviour)

Photo by Jason Leung on Unsplash

Lime can be used to get more insight into model predictions, for example to explain why a model takes a particular decision for an individual observation. It can also be quite useful when selecting between different models. The central idea behind Lime is that, rather than producing explanations at the level of the entire model, it explains locally, in the vicinity of the instance being explained, by perturbing the different features. It does so by fitting a sparse linear model on the locally dispersed, noise-induced dataset, which effectively converts a non-linear problem into a linear one. The indicator variables with the largest coefficients in this local model are then returned as the drivers of the score.


Installation

You can simply pip install it or install it from a clone of the GitHub repo:

pip install lime

Or, from a clone of the GitHub repo:

git clone https://github.com/marcotcr/lime.git
cd lime
pip install .

Implementation

We will use Lime to explain the predictions of a random forest regressor trained on the Diabetes dataset that ships with scikit-learn. This post assumes that you already have some knowledge of Python and machine learning. For the sake of simplicity, we will not cover all the steps we usually follow in the model-building pipeline, like visualization and pre-processing. For the model-building bit, you can clone the repo here.
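If you don’t want to clone the repo, here is a minimal sketch of the setup the rest of this post assumes. The variable names model, X_train, X_test, and feature_names are my own choices for illustration, not necessarily those used in the repo:

from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Load the Diabetes dataset bundled with scikit-learn
data = load_diabetes()
X, y = data.data, data.target
feature_names = data.feature_names

# Hold out a test set so we have unseen instances to explain later
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit a plain random forest regressor
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)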

So, let’s cut to the chase and see how can we explain a certain instance using Lime.

Understanding model behavior with Lime mainly comprises two steps:

  • Initialize an explainer
  • Call explain_instance

The first step in explaining a model prediction is to create an explainer. We can use the LimeTabularExplainer, which is the main explainer for tabular data. Lime scales and generates new data around the instance using locality, and computes statistics like the mean for numerical features and frequencies for categorical ones; this is why we need to pass our training data as a parameter.
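Continuing the sketch above, initializing the explainer could look like this (mode and feature_names are set explicitly here; adjust them to your own data):

from lime.lime_tabular import LimeTabularExplainer

# The training data lets Lime compute per-feature statistics (means, standard
# deviations, frequencies) that it uses when perturbing an instance.
explainer = LimeTabularExplainer(
    X_train,
    feature_names=feature_names,
    mode="regression",
)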

In the second step, we simply call explain_instance for the instance you need explanations for. You can use a different ‘i’ if you wish to understand a different instance.
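For example, explaining the i-th row of the held-out set from the sketch above might look like this (num_features simply caps how many features appear in the explanation):

i = 0  # index of the instance to explain; change it to inspect another row

exp = explainer.explain_instance(
    X_test[i],       # the single observation to explain
    model.predict,   # prediction function of the trained regressor
    num_features=5,  # show only the top 5 contributing features
)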

Finally, we can use the explainer to display the explanation for a particular prediction in the Jupyter Notebook.
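With the explanation object from the previous step, one call renders the chart inline; as_list gives you the raw feature weights if you prefer to work with them programmatically:

# Render the feature-contribution chart inline in a Jupyter Notebook
exp.show_in_notebook(show_table=True)

# Or inspect the (feature, weight) pairs directly
print(exp.as_list())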

Lime output explaining the Instance (Image by Author)

As we make a model more complex, its interpretability decreases, and vice versa. A word of advice: keep an eye on the trade-off between model complexity and interpretability.

You can optionally save your explanations as an HTML file, which makes them easier to share.

exp.save_to_file("explanation.html")

Alternatives

  • Eli5 – another library for model explainability. I have used it for textual data and it works fine. You can read more on Eli5 here.
  • SHAP – Shapley Additive Explanations, as the name suggests, tells you how the score for an instance was built up in an additive manner. SHAP has not only a generic explainer that works for any model, but also a TreeExplainer for tree-based models. It theoretically guarantees consistency, but is slower than Lime, and the computational cost of exploring all possible feature combinations grows exponentially; a minimal sketch follows below.
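For reference, here is a minimal sketch of the TreeExplainer route for the random forest trained earlier. It assumes the shap package is installed and reuses the model and X_test names from the sketches above:

import shap

# TreeExplainer exploits the tree structure to compute Shapley values efficiently
tree_explainer = shap.TreeExplainer(model)
shap_values = tree_explainer.shap_values(X_test)

# Each row of shap_values holds the additive contribution of every feature to
# that instance's prediction, relative to the expected (baseline) value.
print(tree_explainer.expected_value, shap_values[0])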

Conclusion

Lime provides human-readable explanations and is a quick way to analyze the contribution of each feature, which helps us gain better insight into a machine learning model’s behavior. Once we understand why the model predicted the way it did, we can build trust in it, which is critical for interacting with machine learning. In this post, we used a random forest regression model and interpreted its prediction on a particular instance.

Interestingly, Lime also provides explainers for images, textual data, and classification problems. You can explore Lime explanations further on more complex models such as XGBoost and LightGBM and compare their predictions. Read more on Lime here. Also, here’s an interesting read on different tools for transparency and explainability in AI.

I’d love to hear your thoughts about Lime, Machine learning, and Explainable AI in the comments below. If you found this useful and know anyone you think would benefit from this, please feel free to send it their way.
