Explaining Text Classifier Outcomes Using LIME

What makes your question insincere on Quora?

Maha Amami
Towards Data Science


In the previous post on leveraging explainability in real-world applications, I gave a brief introduction to XAI (eXplainable AI), the motivation behind it, and the application of explainable models in real-life scenarios.

In this post, I will introduce LIME, one of the best-known local explanation methods, and show how to apply it to detect the terms that make a question on the Quora platform insincere.

What is LIME and how does it work?

The authors in [1] proposed LIME, an algorithm that explains individual predictions of any classifier or regressor in a faithful and intelligible way by approximating them locally with an interpretable model.

For instance, an ML model predicts that a patient has the flu using a set of features (sneeze, weight, headache, no fatigue, and age), and LIME highlights the symptoms in the patient’s history that led to the prediction (the most important features). Sneeze and headache are portrayed as contributing to the flu prediction, while no fatigue is evidence against it. With these explanations, a doctor can make an informed decision about whether to trust the model’s prediction.

Source: “Why Should I Trust You?” Explaining the Predictions of Any Classifier [1]

Explaining a prediction means presenting textual or visual artifacts that provide a qualitative understanding of the relationship between the instance’s components (e.g., words in a text, patches in an image) and the model’s prediction [1].

Intuition behind LIME

LIME is a local surrogate model: an interpretable model trained to approximate the predictions of the underlying black-box model. Instead of using the original training data, LIME generates variations of the instance of interest, feeds them into the machine learning model, and observes what happens to the predictions; this perturbed data then serves as the training set for the surrogate.

In other words, LIME generates a new dataset consisting of perturbed samples and the corresponding predictions of the black-box model. On this new dataset, LIME then trains an interpretable model (e.g., Lasso, decision tree, …), which is weighted by the proximity of the sampled instances to the instance of interest.

The bold red cross is the instance being explained. LIME samples instances, gets predictions using the black-box model (represented by the blue/pink background), and weighs them by the proximity to the instance being explained (represented here by size). The dashed line is the learned local explanation [1].
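To make this concrete, here is a toy sketch of the idea for text (not LIME’s actual implementation): mask words of the question at random, query the black-box model on each variation, weight the variations by their similarity to the original, and read the coefficients of a weighted linear model as the explanation. The `black_box_predict` callable and the kernel choice are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(question, black_box_predict, num_samples=5000, kernel_width=0.25):
    """Toy sketch of a local surrogate: perturb, predict, weight, fit."""
    words = question.split()
    rng = np.random.default_rng(0)

    # 1. Perturb: randomly switch words on/off (binary masks, original kept as row 0).
    masks = rng.integers(0, 2, size=(num_samples, len(words)))
    masks[0] = 1
    perturbed = [' '.join(w for w, keep in zip(words, m) if keep) for m in masks]

    # 2. Query the black-box model on every perturbed sentence.
    preds = black_box_predict(perturbed)  # probability of the class of interest

    # 3. Weight each sample by its proximity to the original question.
    distances = 1.0 - masks.mean(axis=1)               # fraction of words removed
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 4. Fit a weighted, interpretable (linear) model on the binary masks.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, preds, sample_weight=weights)

    # The coefficients are the local explanation: one weight per word.
    return sorted(zip(words, surrogate.coef_), key=lambda wc: -abs(wc[1]))
```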

Applying LIME to the Quora dataset and a logistic regression model

The dataset for the Quora Insincere Questions Classification task can be downloaded from this link. The training data includes the question that was asked and whether it was identified as insincere.

Let us look at two questions from this dataset and the corresponding classes (1 for an insincere question, 0 for a sincere one):

  • Insincere question: Why does Trump believe everything that Putin tells him? Is he a communist, or plain stupid?
  • Sincere question: Can the strong correlation between latitude and prosperity be partially explained by another one (if proven to exist) between favourable ambient temperatures and brain enthropy?

The preprocessing step consists of splitting the data into train and validation sets, then vectorizing the questions into tf-idf vectors.

Preprocessing code.
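A minimal sketch of what this step might look like, assuming the Kaggle training file `train.csv` with its `question_text` and `target` columns; the variable names are illustrative:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer

# Load the Quora training data (columns: qid, question_text, target)
df = pd.read_csv('train.csv')

# Split into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(
    df['question_text'], df['target'], test_size=0.2, random_state=42)

# Vectorize the questions into tf-idf vectors
# (lowercase=False keeps capitalised tokens such as 'Quora' intact)
vectorizer = TfidfVectorizer(lowercase=False)
train_vectors = vectorizer.fit_transform(X_train)
val_vectors = vectorizer.transform(X_val)
```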

The black-box model is a logistic regression classifier that takes the tf-idf vectors as input.

Logistic regression as a black-box model.
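A minimal sketch of the black-box model, continuing with the variable names assumed above:

```python
from sklearn.linear_model import LogisticRegression

# Fit the black-box classifier on the tf-idf vectors
clf = LogisticRegression(solver='liblinear')
clf.fit(train_vectors, y_train)

print('Validation accuracy:', clf.score(val_vectors, y_val))
```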

It is now time to apply the LimeTextExplainer function to generate local explanations for predictions. The function needs as parameters the question to explain (of index 130609), the label predicted for the question by the black-box model (the logistic regression), and the number of features to use in the explanation.

Generating explanations for one instance using LimeTextExplainer.
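The snippet below is a reconstruction of that step using LIME’s standard text API; the pipeline, the variable names, and the way the validation set is indexed are assumptions carried over from the earlier sketches.

```python
from lime.lime_text import LimeTextExplainer
from sklearn.pipeline import make_pipeline

class_names = ['sincere', 'insincere']

# Chain the vectorizer and the classifier so LIME can work on raw text
pipeline = make_pipeline(vectorizer, clf)

explainer = LimeTextExplainer(class_names=class_names)

idx = 130609                      # index of the question discussed below
question = X_val.iloc[idx]

# Explain the insincere (label 1) prediction with the 10 most important words
exp = explainer.explain_instance(question, pipeline.predict_proba,
                                 labels=(1,), num_features=10)

print('Question:', question)
print('Probability (Insincere) =', pipeline.predict_proba([question])[0, 1])
print('Probability (Sincere) =', pipeline.predict_proba([question])[0, 0])
print('True Class is:', class_names[y_val.iloc[idx]])
```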

The result of the above code is the following:

Question: 
When will Quora stop so many utterly stupid questions being asked here, primarily by the unintelligent that insist on walking this earth?
Probability (Insincere) = 0.745825811972627
Probability (Sincere) = 0.254174188027373
True Class is: insincere

The classifier got this example right (it predicted insincere).
The explanation is presented below as a list of weighted features, obtained with the following instruction:
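(The instruction below is a one-line reconstruction; `exp` is the explanation object returned by `explain_instance` above, and `as_list()` is LIME’s standard accessor for the weighted features.)

```python
# Weighted features of the local explanation, as (word, weight) pairs
exp.as_list()
```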

The result is:

[('stupid', 0.3704823331676872),
('earth', 0.11362862926025367),
('Quora', 0.10379246842323496),
('insist', 0.09548389743268501),
('primarily', -0.07151150302754253),
('questions', 0.07000885924524448),
('utterly', 0.040867838409334646),
('asked', -0.036054558321806804),
('unintelligent', 0.017247304068062203),
('walking', -0.004154838656529393)]

These weighted features form a linear model that approximates the behavior of the logistic regression classifier in the vicinity of the test example. Roughly, if we remove ‘stupid’ and ‘earth’ from the question, the prediction should move towards the opposite class (sincere) by about 0.48 (the sum of the weights of both features). Let’s see if this is the case.
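Below is a sketch of that check, following the usual LIME tutorial pattern of zeroing out the tf-idf entries of the two words; `val_vectors`, `vectorizer`, `clf`, and `idx` are the objects assumed in the earlier sketches.

```python
# Copy the tf-idf row of the question and zero out the two words
tmp = val_vectors[idx].copy()
tmp[0, vectorizer.vocabulary_['stupid']] = 0
tmp[0, vectorizer.vocabulary_['earth']] = 0

original = clf.predict_proba(val_vectors[idx])[0, 1]
after = clf.predict_proba(tmp)[0, 1]

print('Original prediction:', original)
print('Prediction after removing some features:', after)
print('Difference:', after - original)
```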

The result is:

Original prediction: 0.745825811972627
Prediction after removing some features: 0.33715161522095155
Difference: -0.40867419675167543

As expected, the class is now sincere after removing the words ‘earth’ and ‘stupid’ from the instance vocabulary.

LIME can also present the results in different types of visualizations.
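In a Jupyter notebook, for instance, the standard highlighted-text view can be rendered with LIME’s `show_in_notebook` method (using the `exp` object from above):

```python
# Render the interactive visualization in the notebook,
# highlighting the weighted words directly in the question text
exp.show_in_notebook(text=True)
```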

Notice that, for each class, the words on the right side of the line are positive and the words on the left side are negative. Thus, ‘stupid’ is positive for insincere but negative for sincere.

You can also get a bar plot of the explanations using the code below:
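A minimal sketch, assuming `exp` is the explanation object from above; `as_pyplot_figure()` is LIME’s built-in matplotlib bar-plot helper:

```python
import matplotlib.pyplot as plt

# Horizontal bar plot of the word weights for the insincere class
fig = exp.as_pyplot_figure(label=1)
plt.show()
```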

Summary

LIME is able to explain the predictions of any type of classifier (SVM, neural nets, …) locally. In this post, I applied it to the Quora questions dataset to explain what makes a question insincere on Quora, but it can also be applied to image and structured-data classifiers. You can access more code and examples by following this link.

If you have a question, feel free to comment below or ask it via email or LinkedIn, and I’ll answer it.

The entire code is posted on my GitHub profile at this link.

I will continue to post about XAI and other fun topics. Stay tuned!

References

[1] Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135–1144).

LIME code: https://github.com/marcotcr/lime
