Should AI explain itself? Or should we design Explainable AI so that it doesn’t have to?

Prajwal Paudyal
9 min read · Mar 4, 2019
The question of interpretability

In this article, I’ll go over:

  1. What is Explainable AI (XAI) and why do we need it?
  2. Some viewpoints on why XAI won’t work, or isn’t needed.
  3. The performance vs. explainability conundrum: can we have the best of both worlds?

TL;DR

This article got longer than what I originally intended, so for the busy souls, here is a synopsis.

Explanations for AI behavior that are generated ad hoc or post hoc are more like justifications and may not capture the truth of the decision process. If trust and accountability are needed, they have to be taken into account early in the design process. Explainable AI (XAI) is NOT an AI that can explain itself; it is a design decision by developers. It is AI that is transparent enough that the explanations that are needed are part of the design process.

Now, the full story.

Why XAI?

A self-driving car knocked down and killed a pedestrian in Tempe, AZ in 2018. Questions like who is to blame (accountability), how to prevent this (safety) and whether to ban self-driving cars (liability and policy evaluation) all require the AI models behind those decisions to be interpretable.

This research paper shows that a classifier that could tell wolves from huskies was basing its decision solely on the presence of snow in the background.

This article by ProPublica shows how a predictive model for the risk of repeat criminal offense can have a strong ethnic bias. Similar ethnic and gender bias has been exposed in candidate-screening algorithms for hiring and in loan-approval applications, among others.

So, if we measure the aptitude of an AI system solely by one performance metric (e.g. accuracy), we might be in for a surprise later. (Article: Do AI take shortcuts?)
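To see how a single aggregate metric can hide this kind of bias, here is a minimal sketch with made-up labels, predictions and group memberships (not data from the ProPublica study): the overall accuracy looks fine while the false positive rates differ sharply between the two groups.

```python
import numpy as np

# Hypothetical ground truth, predictions and group membership (toy data)
y_true = np.array([0, 0, 1, 1, 0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

print("overall accuracy:", (y_true == y_pred).mean())  # 0.75

# False positive rate per group: P(pred = 1 | true = 0, group = g)
for g in np.unique(group):
    negatives = (group == g) & (y_true == 0)
    fpr = (y_pred[negatives] == 1).mean()
    print(f"group {g}: false positive rate = {fpr:.2f}")  # A: 0.50, B: 0.00
```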

Adversarial examples

Researchers were able to selectively distort images to make a classifier think they were ostriches

Then, there are adversarial examples, where attackers selectively add distortions to images to make an otherwise robust classifier always choose a certain class. These distorted images look normal to humans, so the attack is essentially undetectable to the human eye. This has big implications for security-sensitive applications.
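One common recipe for such distortions is the fast gradient sign method (FGSM). The sketch below is only meant to show the idea, not reproduce the ostrich result; it assumes `model` is a differentiable PyTorch classifier and `image` is a tensor of shape (1, C, H, W) with values in [0, 1].

```python
import torch
import torch.nn.functional as F

def targeted_fgsm(model, image, target_class, epsilon=0.01):
    """Nudge an image toward a chosen target class (targeted FGSM sketch)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), torch.tensor([target_class]))
    loss.backward()
    # Step *against* the gradient so the loss for the target class goes down.
    # With a small epsilon the change is imperceptible to humans.
    adversarial = image - epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```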

As AI systems become involved in more and more of the decisions we make, the desire and the need to make AI explainable will only grow. However, the decisions made by most of the AI systems we use today are not explainable.

What kind of explanations do we need?

Now that we have established why we need XAI, let’s talk about what kind of explanations are desirable.

The first type is explanations that help us understand the data better. An example would be a system that tells you that a particular picture is a cat because it looks like another example of a cat (nearest neighbor). Another XAI might tell you it is a cat because it has whiskers and fur (features).
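As a rough sketch of the nearest-neighbor style of explanation, assuming the images have already been turned into feature vectors by some embedding model (everything below is hypothetical):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical embeddings and labels for the training images
rng = np.random.default_rng(0)
train_features = rng.random((1000, 128))
train_labels = rng.choice(["cat", "dog"], size=1000)

index = NearestNeighbors(n_neighbors=1).fit(train_features)

def explain_by_example(query_features):
    """Explain a prediction by pointing at the most similar training image."""
    _, idx = index.kneighbors(query_features.reshape(1, -1))
    nearest = idx[0][0]
    return f"It looks like training image #{nearest}, which is a {train_labels[nearest]}"

print(explain_by_example(rng.random(128)))
```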

Explanations on what features caused a decision [2]

The second type is explanations that help us understand the model better. Approaches that visualize the various neuron activations in neural networks fall primarily into this category.

XAI explanations will usually cover both of these areas, but will usually lean more heavily toward one of them.

Why not XAI?

Not all applications require explanations, so the first step is to determine whether explanations are needed. If the application in mind has no critical behavior that can give rise to liability, then a black-box model will work just fine. However, as we discussed above, there are many applications that absolutely need XAI.

Now, to make sure our understanding is well-rounded, let’s go over some of the criticisms of XAI.

1. Too complex to explain

GFLOPs vs. accuracy for various models

Skeptics point out (and correctly so) that most popular AI models with good performance have around 100 million parameters. That means 100 million numbers were learned during training, and all of them contribute to a decision. With that complexity, how can we even begin to think about which factors should go into an explanation?

2. Performance vs. Explainability Tradeoff

Machine learning for classification works by 1) transforming the input feature space into a different representation (feature engineering) and 2) searching for a decision boundary that separates the classes in that representation space (optimization). Modern deep learning approaches perform 1 and 2 jointly via hierarchical representation learning.

Possible tradeoff with performance [2]

Simple models like linear regression are inherently explainable, since the weights and their signs indicate the importance of each feature, and decisions can be explained in terms of those weights.
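For instance, with scikit-learn the learned weights can be read straight off a linear classifier; this is a minimal sketch with a toy dataset and hypothetical feature names:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["income", "debt", "age"]   # hypothetical features
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = (X[:, 0] - X[:, 1] > 0).astype(int)     # toy target

model = LogisticRegression().fit(X, y)
for name, weight in zip(feature_names, model.coef_[0]):
    # The sign and magnitude of each weight are the explanation.
    print(f"{name}: weight = {weight:+.2f}")
```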

Decision trees are also explainable, because the various attributes are used to form splits. These splitting decisions can be used as a basis for mining rules that can, to some extent, serve as explanations. (There are, however, issues with ambiguities.)
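As a quick sketch, scikit-learn can dump the learned splits of a trained tree as nested if/else rules:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2).fit(data.data, data.target)

# Print the learned splits as human-readable rules
print(export_text(tree, feature_names=list(data.feature_names)))
```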

A deep neural network (DNN) that learns millions of parameters and may be regularized by techniques like batch normalization and dropout is quite incomprehensible. Many other machine learning techniques face this problem as well.

However, with DNNs achieving state-of-the-art performance in many domains, often by a large margin, it is hard to make a case against using them.

3. Humans cannot explain their decisions either

If we hold up humans as the gold standard, then the argument goes that we should either distrust all humans, or be wholly content with our unexplainable AI partners.

There is evidence that most thinking and decision-making in humans happens unconsciously. There is also evidence that we have evolved to invent explanations when, in truth, we are at times clueless about the decision process. [1] We also generate naive explanations for things we do for reasons we don’t want to admit, if only to avoid cognitive dissonance.

In the face of all this, how do we expect AI to be able to explain itself?

XAI isn’t AI that explains itself; rather, it is AI that doesn’t need to

Ad-hoc and post-hoc XAI systems have been proposed that use two separate models: a learning model to make the decisions and an explanation model to provide the explanations.

This is perhaps similar to our human neurology, and these explanations are more like justifications given by the PR team for a bad executive decision.

If the explanation model knew what was going on AND could explain it, why not use it instead of the decision model in the first place?

Wanting a separate AI to explain an AI might not be the way to go, especially for reasons of accountability, trust or liability, that is, any time a higher-order decision has to be made by taking the explanations into consideration.

So, ad-hoc and post-hoc explanations can be used to better understand the modeling process, or to research how AI works behind the scenes, but they are usually not enough to support business-level decisions.
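To make the two-model setup concrete, here is a rough sketch of a post-hoc "local surrogate" explanation in the spirit of LIME: perturb the input, query the black box, and fit a small linear model to its answers. The weights describe the surrogate, not necessarily the true decision process, which is exactly the limitation discussed above. The function `black_box_predict` is a placeholder for whatever model is being explained.

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(black_box_predict, x, n_samples=500, scale=0.1):
    """Fit a linear surrogate around a single input x (LIME-style sketch).

    black_box_predict maps an array of inputs to scalar scores.
    The returned weights only *approximate* the model near x.
    """
    rng = np.random.default_rng(0)
    neighborhood = x + scale * rng.standard_normal((n_samples, x.size))
    scores = black_box_predict(neighborhood)
    surrogate = Ridge(alpha=1.0).fit(neighborhood, scores)
    return surrogate.coef_

# Usage with a hypothetical black box returning the score for one class:
# weights = local_surrogate(lambda X: model.predict_proba(X)[:, 1], x)
```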

Let us recap what we covered so far.

  1. Humans use complex and incomprehensible decision processes as well. These may not be explainable themselves.
  2. Perhaps black-box models cannot be avoided, because the problems we want to solve are complex and non-linear. If simpler models were used, there would be a trade-off in performance (for explainability), which we don’t want.

So where do we go from here?

Solution: Think about interpretability early, early on

We believe that we’re seeing the world just fine until it’s called to our attention that we’re not. Then we have to go through a process of learning — David Eagleman

Think of the explanations needed EARLY on.

If the application being designed requires explanations, then the expected explanations have to be brought to the table much earlier, while finalizing the requirements, so the architectural design can incorporate them.

Rather than making the very tough choice between simpler but explainable models with lower performance and complex black-box models with no explanations, the better option might be to:

  1. Know what explanations are desired (probably in consultation with the parties that require the explanations)
  2. Design the architecture of the learning method to produce intermediate results that correspond to these explanations.

Now, this is easier said than done. Knowing what explanations are required of an application will need close collaboration among all stakeholders. For instance, a loan-approval AI application will need explanations that are radically different from those of a face-identification algorithm. In other applications, learning to make decisions by first learning high-level attributes may still be impossible.

To make this understanding more concrete, let us look at a practical example of an AI application from this research that teaches people sign language words. If the student executes a sign correctly, everything is fine. But if the student hasn’t quite gotten it yet, then, like any good tutor, the AI application is expected to give some feedback to the student. This AI lives in a computer and can use its cameras to look at the learner.

The feedback as explanations suggested by [3]

Now, if black-box Computer Vision (CV) algorithms are utilized, the AI can perhaps determine if a student did a sign incorrectly, but what feedback can it provide? It can only say ‘try again’, perhaps with some tonal adjustments.

However, let us consider another AI that is designed with the possible explanations in mind.

Sign language linguists have postulated that signs differ from each other in the location of signing, the movement, or the hand-shape. (In other words, you can mess up in one of these three ways.)*

‘Phonetics’ of a Sign

Now, with that in mind, separate AI models can be trained to detect the correctness of each of these attributes. The final decision can then be a combination of the decisions of these three models.

So, when a new learner makes a mistake in one of these, appropriate feedback like ‘your hand-shape wasn’t correct’ can be given, along with examples of the correct one.
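Here is a sketch of how such a decomposed design might be wired together; the three attribute models are hypothetical, and their internals can remain black boxes:

```python
def assess_sign(video, location_model, movement_model, handshape_model):
    """Combine per-attribute verdicts into a decision plus targeted feedback."""
    # Each (hypothetical) model judges one linguistically meaningful attribute.
    checks = {
        "location": location_model.is_correct(video),
        "movement": movement_model.is_correct(video),
        "hand-shape": handshape_model.is_correct(video),
    }
    if all(checks.values()):
        return "Correct!", []
    feedback = [f"Your {attribute} wasn't correct"
                for attribute, ok in checks.items() if not ok]
    return "Try again", feedback
```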

The key insight here is that recognizing the shape of the hand, or whether it matches another hand-shape, is a problem perhaps best solved by a CNN, which itself doesn’t need to be explained. However, the bigger problem of recognizing a sign is broken down into sub-problems, which form the basis for explanations, and this knowledge is domain specific (i.e. this breakdown will only work for sign languages).

Open Questions

Can all problems that need explanations be broken down like this? If so, can we develop a theoretical framework for doing so? Can we quantify the explanation-performance trade-off and minimize it? How granular should the explanations be? What are the best ways to provide explanations?

These questions will have to be answered in collaboration with experts from the fields where the AI algorithms are deployed. One thing is for sure: going forward, as people and governments demand more explanation (GDPR 2018), the ‘one model fits all problems’ approach to AI will not be the answer.

Comments? Suggestions?


References

  1. Incognito: The Secret Lives of the Brain (David Eagleman)
  2. DARPA document on XAI
  3. Learn2Sign

Annotations

*Facial expressions aren’t considered in this work, and orientation is learned jointly with hand-shape.
