🤖 Deep Learning

Grad-CAM: A Camera For Your Model’s Decision

Lights, CAM, Gradients!

Shubham Panchal
Towards Data Science
13 min read · Aug 15, 2021


Model interpretability is one of the booming topics in ML because of its importance in understanding black-box neural networks and ML systems in general. Interpretability techniques help identify potential biases in ML systems, which can lead to failures or unsatisfactory user experiences.

You’ll see headlines about biased ML systems every now and then, as the importance of bias-free ML models has been rising in the modern world. Model interpretability techniques allow us to examine our ML systems for inefficiencies and to look for potential biases as well.

In this story, we’ll study the Grad-CAM technique for generating CAMs ( class activation maps ), which help us visualize what our CNNs ( or other models ) look at in the given data.

Also, you can find the source code for the Grad-CAM implementation ( using TF ) in the official Keras examples repo.

🧾 Contents

1. Intro To Model Interpretability

2. How do we interpret the model’s decision?

3. Enter Grad-CAM

4. Learning Through An Experiment: Grad-CAM

— 4.1. Understanding The Setup

— 4.2. Generating A Score For Each Feature Map

— 4.3. Generating the Grad-CAM heatmap

5. Counterfactual Explanations With Grad-CAM

6. Checking Bias In A Model’s Decision

7. Next Steps / More Stories from the Author

8. References

1. ⚖ Intro To Model Interpretability

Reverse-engineering the black box

Fig. 1 -> Inference made by an image-classification CNN. Source: Image By Author

One thing you might have noticed while reading introductory books on deep learning is that they refer to neural networks as *‘black boxes’. We know that neural networks have a well-defined mathematical structure, so why are they referred to as ‘black boxes’?

The reason is that we can’t decode the decisions made by a NN. For instance, let’s consider a Dog vs. Cat classifier built as a simple CNN ( see figure 1 ). The decision made by the model is the class it predicts for a given image: whether the image contains a ‘dog’ or a ‘cat’.

Decoding a decision, in our case, would help us investigate the traits or features of the input image which had a major contribution in producing that decision. If our model predicts the class ‘dog’, then an important feature which might have produced this decision could be the nose, the ears or any other feature which distinguishes it from the class ‘cat’. We would like to know on what basis our model predicts a certain label, like ‘cat’ or ‘dog’.

“In order to build trust in intelligent systems and move towards their meaningful integration into our everyday lives, it is clear that we must build ‘transparent’ models that have the ability to explain why they predict what they predict.” — from [ 1 ]

Considering an orange vs. apple image classification problem, the shape of these fruits is not a good feature, as the two have similar shapes. Relying on it might give us *incorrect results once the model has been deployed in the real world.

To better understand the concepts of model interpretability in ML, you may refer to the blogs listed in [ 2 ].

In order to identify such features and also to discover the focus of our NN model for a particular decision, we introduce the concept of Model Interpretability ( see [ 2 ] ). It consists of all those techniques which bring interpretability and transparency to the decisions of our models.

*‘black boxes’: In our context, a ‘black box’ refers to a model whose decisions can’t be decoded. In layman’s terms, we can’t satisfactorily answer the question, ‘Why was this decision made by our model?’.

*incorrect results: We might like to identify such features, so that we can know the weak points of our model.

2. 🎓 How do we interpret the model’s decision?

Gradients are all you need, especially in DL

To better interpret the decision of our model, we need to determine which feature in our inputs had the highest contribution to that decision. In our Dog vs. Cat classifier, if an image was predicted to contain a ‘cat’, then which features, like the eyes, body color or ears, did the model consider to generate this prediction?

Well, we can analyze gradients in order to check what our model focuses on while generating a certain prediction. Gradients help us measure the effect on the outputs ( some function f of x ) caused by the inputs ( x ). This is what we need: the part or region of the image which has the highest effect on the NN’s output.

We’ll make use of gradients to investigate the regions of an image which are important for the model’s decision. This is the ‘Grad-’ in the Grad-CAM approach, which we’ll explore in the next section.

3. Enter Grad-CAM

Well, that’s what the story is all about!

Source: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

Grad-CAM was introduced in 2017 in the paper ‘Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization’ ( see [ 1 ] ) to generate class activation maps ( CAMs ) for a certain output of a model. Their approach also generalizes the approach of [ 7 ] and can be used for other types of models as well. In the paper, one can find visualizations for an image captioning model as well as a VQA ( Visual Question-Answering ) model.

First, we’ll understand the basic implementation of Grad-CAM right from the fundamentals of a NN. Hence, the story expects the reader to have some knowledge of NNs, backpropagation and convolutional layers.

Then, we’ll highlight some of the features of Grad-CAM which help in better understanding our model. So, let’s start!

4. 🧪 Learning Through An Experiment: Grad-CAM

4.1. Understanding The Setup

Understanding the parameters which hold information regarding the model’s decision

Fig. 2 -> Bottom layers of a CNN. Source: Image By Author

Assume that we have made a simple CNN for the cat vs. dog image classification problem. Our CNN will predict the distribution over the labels ‘CAT’ and ‘DOG’, for which we need a *softmax activation on the last Dense layer ( see figure 2 ). Our CNN consists of some convolution layers, followed by a flatten layer and then some FC ( Fully Connected or Dense ) layers. Instead of taking all the convolution layers into consideration, we only analyze the *outputs of the last convolution layer in our case. The outputs of the last convolution layer are flattened and passed to the FC layers. As observed in figure 2, the outputs of the last convolution layer are K feature maps, each of width W and height H. We represent them collectively as a tensor A, where A_k is the kth feature map.

Fig.3 -> Shapes of A and A_k. Source: Image By Author

To know the difference between a kernel and a filter, you may refer to my earlier story on the topic.

The outputs of our CNN, before the softmax, are y_c; in the case of the cat vs. dog classifier, y_c is an array with shape ( 2 , 1 ). The first and the second elements of this array are the outputs for the classes ‘cat’ and ‘dog’ respectively.

Fig.4 -> Composition of y_c. Source: Image By Author
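
To make the setup concrete, here is a minimal Keras sketch of such a CNN, with the last convolution layer exposed alongside the outputs. The architecture, the input size and the layer name last_conv are assumptions for illustration only, not the exact model from the story or the paper.

```python
import tensorflow as tf
from tensorflow import keras

# A small cat-vs-dog CNN (an assumed architecture, only for illustration).
model = keras.Sequential([
    keras.Input(shape=(128, 128, 3)),
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, activation="relu", name="last_conv"),  # the K feature maps A
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(2, activation="softmax"),  # distribution over [ 'cat', 'dog' ]
])

# For Grad-CAM we work with the pre-softmax scores y_c, so we temporarily
# drop the softmax (the official Keras example does the same).
model.layers[-1].activation = None

# A "grad model" mapping an input image to both A (the last conv layer's
# feature maps) and y (the pre-softmax scores), so that we can later
# differentiate y_cat with respect to A.
grad_model = keras.Model(
    inputs=model.inputs,
    outputs=[model.get_layer("last_conv").output, model.output],
)
```

The grad model simply re-wires the existing network so that a single forward pass gives us both A and y_c.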

As we discussed earlier, we’ll make use of gradients to decode the model’s decision. To compute a gradient, we require a function and a variable with respect to which we compute it. Our aim is to study the relationship between the feature maps A and the outputs of our CNN, y.

Each feature map captures some high-level feature of the input image and has its own contribution to the final decision y. Let’s suppose that we need to decode the decision of predicting the class ‘cat’, so we’ll focus only on one of the outputs, y_cat. Any change in any of the feature maps in A would result in a change in the value of y_cat. So, why not compute the gradient of y_cat with respect to one of the feature maps, A_k?

First, let us understand the shape of this gradient, from the diagram below,

Fig.5 -> Shape of the gradient of y_cat w.r.t. A_k. Source: Image By Author

So, our gradient of interest has the shape W * H, which is the same as that of the feature maps in A. We will also use i and j to index the elements of this matrix. Another way we can construct this matrix is by computing the gradient of y_cat w.r.t. one of the elements of A_k, indexed by i and j. This gradient is a real number ( a scalar ),

Fig.6 -> Shape of the gradient of y_cat w.r.t. the element indexed by i and j of A_k. Source: Image By Author

If we compute all such possible gradients, for all values of i and j, and place them in a W * H grid, we get the same gradient as described in figure 5. Many of us might know this already, but it’s important to mention it here, as we’ll use this observation when performing global average pooling of the gradients.

*softmax: Precisely, it should be termed soft argmax, but most ML literature resorts to softmax. See [ 3 ].

*outputs of the last convolution layer in our case: There’s a reason we chose the outputs of the last convolution layer. The last convolution layer captures high-level features and thus holds information regarding the important regions of the input image ( the image fed to the CNN ). A note from the Grad-CAM research paper [ 1 ],

“We find that Grad-CAM maps become progressively worse as we move to earlier convolutional layers as they have smaller receptive fields and only focus on less semantic local features.”

Note that we can always use other layers of the CNN. However, as the initial layers of a CNN capture local features, their gradients won’t explain anything about the global or high-level features which make up the final prediction ( or decision ).
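
Before moving on to the scores, here is a hedged sketch of how this gradient can be computed with tf.GradientTape, reusing the grad_model defined earlier; the class index and the helper name are assumptions for illustration.

```python
import tensorflow as tf

CAT_INDEX = 0  # assumption: index 0 of the output corresponds to the class 'cat'

def conv_outputs_and_gradients(image):
    """Returns A and the gradient of y_cat w.r.t. A for one input image."""
    x = tf.expand_dims(image, axis=0)            # add the batch dimension
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(x)          # conv_out: (1, H, W, K), preds: (1, 2)
        y_cat = preds[:, CAT_INDEX]              # the pre-softmax score for 'cat'
    grads = tape.gradient(y_cat, conv_out)       # same shape as conv_out
    return conv_out[0], grads[0]                 # drop the batch dimension -> (H, W, K)
```

For each feature map A_k, the corresponding slice of grads is exactly the W * H gradient matrix from figure 5.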

4.2. Generating A Score For Each Feature Map

Weighing each of the feature maps according to the influence they have in the final output

In the last section, we obtained a gradient to study the change in y_cat ( or even y_dog ) w.r.t. a single feature map A_k. But our goal is to study the change in y_cat w.r.t. A, a tensor which consists of all K feature maps. Some feature maps in A might have a greater influence on the final output y_cat than others. It would be nice if we could assign a score to each of these feature maps, depending upon their influence on y_cat. As mentioned in the paper, we can average the gradients described in figure 6 over all values of i and j, and use this average as the score for the feature map.

In other words, we are performing global average pooling ( GAP ) of the gradients of the feature map,

Fig.7 -> The score for each feature map. Source: Image By Author

The greater the score for a feature map, the more influence it has on y_c. Since we would like to analyze the features which increase the value of y_c, we would expect a feature map with a higher score to have more positive gradients summed up in the expression in figure 7. A positive value for the gradient shown in figure 6 means that on increasing the value of the element of A_k at position ( i , j ), the value of y_c also increases.
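
In code, this scoring step is a single reduction. The sketch below continues from the earlier snippets and assumes image is a preprocessed input of the shape expected by the model.

```python
import tensorflow as tf

conv_out, grads = conv_outputs_and_gradients(image)   # both of shape (H, W, K)
alphas = tf.reduce_mean(grads, axis=(0, 1))           # shape (K,): one score per feature map
```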

4.3. Generating the Grad-CAM heatmap

Encapsulating all our knowledge to produce the heatmap

As we have computed scores for each of the feature maps in A, we can now proceed towards making the Grad-CAM heatmap, which we saw in an earlier section of the story. Later, we’ll superimpose this heatmap on the input image and see what our model saw to generate a certain prediction.

Using the scores, we compute the weighted sum of all the K feature maps in A,

Fig.8 -> Computing the weighted sum of the feature maps. Source: Image By Author

Finally, we apply an elementwise ReLU operation ( see [ 4 ] ) to obtain the final heatmap,

Fig.9 -> The Grad-CAM heatmap. Source: Image By Author

The reason for using a ReLU operation is clearly described in the paper:

We apply a ReLU to the linear combination of maps because we are only interested in the features that have a positive influence on the class of interest, i.e. pixels whose intensity should be increased in order to increase y_c .
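
Continuing the same sketch, figures 8 and 9 translate into a couple of lines: weight each feature map by its score, sum over k and clip the result with a ReLU. The normalization at the end is only for visualization and is an addition of this sketch.

```python
import tensorflow as tf

weighted_sum = tf.reduce_sum(conv_out * alphas, axis=-1)   # alphas broadcast over (H, W, K)
heatmap = tf.nn.relu(weighted_sum)                         # keep only positive contributions
heatmap = heatmap / (tf.reduce_max(heatmap) + 1e-8)        # scale to [0, 1] for display
```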

You can find the code to superimpose the heatmap onto the input image from the official Keras example on Grad-CAM.

5. 🎎 Counterfactual Explanations With Grad-CAM

Thinking what your model didn’t think!

As mentioned in ‘Interpretable Machine Learning’ by Christoph Molnar ( see [ 5 ] ),

A counterfactual explanation describes a causal situation in the form: “If X had not occurred, Y would not have occurred”

Fig.10 -> Counterfactual explanations as shown in the paper [ 1 ]. Source: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

In our context, a counterfactual explanation highlights the evidence that works against a certain decision. Considering our cat vs. dog classifier example, given an input image, we would like to check which regions of the image pushed the model away from its decision of ‘cat’ or ‘dog’.

In order to compute a counterfactual explanation for a given class and a model, we repeat the same procedure as described in section 4.

But this time, while performing global average pooling over the gradients of the feature map A_k, we negate the gradient:

Fig.11 -> Expression for alpha_c for computing heatmaps as counterfactual explanations. Source: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

By negating the gradient, as shown in figure 11, we are now looking at pixels which cause a decrease in y_c when their value ( or intensity ) increases. The resulting heatmap captures the features which lower the model’s confidence in the prediction, rather than the ones it focused on while making it.
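
In code, the counterfactual variant is a one-line change to the earlier sketch: negate the gradients before the pooling step.

```python
import tensorflow as tf

neg_alphas = tf.reduce_mean(-grads, axis=(0, 1))                       # figure 11
cf_heatmap = tf.nn.relu(tf.reduce_sum(conv_out * neg_alphas, axis=-1))
cf_heatmap = cf_heatmap / (tf.reduce_max(cf_heatmap) + 1e-8)           # scale for display
```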

6. 👩🧑 Checking Bias In A Model’s Decision

Improving datasets and ML systems

Fig.12 -> Bias in a model’s predictions. Source: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

We can use Grad-CAM to identify bias in our model’s predictions that was passed on by the training dataset. Considering the same example as in the paper, suppose we’ve trained a nurse vs. doctor image classifier ( an ImageNet-pretrained model fine-tuned on images collected from an image search engine, as described in the paper ). The classifier performs pretty well on the testing dataset but doesn’t generalize well. Well, how do we know this? Looking at the Grad-CAM heatmaps for the biased model in figure 12, we can observe that the model focuses on unwanted features ( part of the face, the hairstyle ) which should be insignificant while predicting the labels ‘nurse’ and ‘doctor’. The model seems to learn gender-based features which aren’t appropriate for our use-case.

As mentioned in the paper, the dataset was biased: 78% of the images labelled ‘doctor’ contained men, whereas 93% of the images labelled ‘nurse’ contained women. Clearly, the gender bias present in the dataset was passed on to the model. Note that the model still performed well on the testing dataset, with 82% accuracy.

Once we know that a specific bias is present in our dataset, we can re-collect the data and retrain the model. The unbiased model, as seen in figure 12, captures the right features to identify an image as containing a ‘nurse’ or a ‘doctor’. Observe that the model focuses on the stethoscope for identifying the image of a ‘doctor’.

7. 👨‍🦱 Next Steps / More Stories from the Author

Following Grad-CAM, many research groups have proposed their own methods for better model interpretability.

Grad-CAM++ [ 8 ] and Score-CAM [ 9 ] build upon the idea of Grad-CAM. Further, we have Smoothed Score-CAM [ 10 ] as well.

Model interpretability has been an active research area, as you’ll observe. So, if you’ve made some new ‘-CAM’, just let me know in the comments below!

Thanks for reading the story! Feel free to express your comments/queries/suggestions in the comments section below.

8. References

  1. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
  2. Blogs: Fair and Explainable Machine Learning, Addressing the Issue of “Black Box” in Machine Learning
  3. Wikipedia: Softmax function
  4. Understanding Deep Neural Networks with Rectified Linear Units
  5. Interpretable Machine Learning — A Guide for Making Black Box Models Explainable., by Christoph Molnar
  6. Bias in dataset: Deep Learning for COVID-19 detection | Towards Data Science
  7. Learning Deep Features for Discriminative Localization
  8. Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks
  9. Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks
  10. SS-CAM: Smoothed Score-CAM for Sharper Visual Feature Localization
