An Attempt: Detection of COVID-19 Presence from Chest X-ray Scans Using CNN & Class Activation Maps

Souradip Chakraborty
Towards Data Science
10 min read · Apr 2, 2020

Fig 1: Corona Virus Disease 2019, Source

Coronavirus disease 2019 (COVID-19) is a highly infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The disease was first identified in December 2019 in Wuhan, China, and has since spread across the world, affecting more than 200 countries. The impact is such that the World Health Organization (WHO) has declared the ongoing COVID-19 pandemic a Public Health Emergency of International Concern.

As of 1st April 2020, there are a total of 873,767 confirmed cases with 645,708 active cases and 43,288 deaths in more than 200 countries across the globe (Source: Wikipedia).

Fig 2: Corona Virus(COVID-19) Map. Source

Governments are working hard to close borders, implement contact tracing, identify and admit affected people, and isolate probable cases. Even so, the number of individuals affected by the virus is increasing exponentially in a majority of countries, and it is unfortunately expected to keep increasing until a medicine or vaccine can be developed and applied after a significant number of clinical trials.

Fig 3a: Rate of Spread of Coronavirus across several countries. Source

Research suggests that social distancing can significantly reduce the spread and flatten the curve, as shown in Fig. 3b, but is that sustainable? Well, I leave the answer to you all.

Fig 3b: Effect of Social Distancing on the spread of Corona Virus. Source

So, in this scenario, one primary thing that needs to be done, and that has already started in the majority of countries, is large-scale testing, so that the true situation can be understood and appropriate decisions can be taken.

But these tests are critical and must be done with great precision, which takes time. This can be highly dangerous: if infected people are not isolated in time, they can infect others, which might lead to an exponential increase as in Fig. 3a. Especially in countries like India, where the population density is exceptionally high, this could be devastating.

The standard COVID-19 tests are PCR (polymerase chain reaction) tests, which detect the genetic material (RNA) of the virus in a patient sample. But there are a few issues with the test. Pathogenic laboratory testing is the diagnostic gold standard, but it is time-consuming and produces a significant number of false negatives, as mentioned in this paper.

Moreover, many developing and underdeveloped countries cannot afford large-scale implementation of COVID-19 tests, which are extremely expensive. Hence, parallel diagnosis/testing procedures that use Artificial Intelligence and Machine Learning and leverage historical data would be extremely helpful. They could also help in selecting who should be tested first.

Fast and accurate diagnostic methods are urgently needed to combat the disease. In a very recent paper, ‘A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19)’ by Shuai Wang et al., the authors used Deep Learning to extract COVID-19’s graphical features from Computerized Tomography (CT) scans in order to provide a clinical diagnosis ahead of the pathogenic test, thus saving critical time for disease control.

The study used transfer learning with an Inception Convolutional Neural Network (CNN) on 1,119 CT scans. The internal and external validation accuracy of the model was recorded at 89.5% and 79.3%, respectively.

In my experiment, I have performed a similar analysis on chest X-ray (CXR) images. The main reason is that CXRs are more accessible than CT scans, especially in rural and isolated areas. More data is also likely to become available.

Now let’s come to the dataset I used. Dr. Joseph Paul Cohen (Postdoctoral Fellow at the University of Montreal) recently open-sourced a database of chest X-ray images of patients suffering from COVID-19. The dataset consists of COVID-19 images from publicly available research, as well as lung images for other pneumonia-causing conditions such as SARS, Streptococcus, and Pneumocystis.

The dataset contains the COVID-19 X-ray images together with the view (angle) from which each scan was taken. The most frequently used view turns out to be the posteroanterior (PA) view, so I have considered only the COVID-19 PA-view X-ray scans for my analysis.
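As a rough sketch of that selection step, the snippet below filters a metadata table down to COVID-19 scans taken in the PA view. The column names (`filename`, `finding`, `view`), the label spellings, and the sample rows are assumptions made for illustration; check them against the metadata file shipped with the repository before relying on them.

```python
import csv
import io

# Hypothetical sample rows; the real metadata file may use different
# column names or label spellings.
SAMPLE_METADATA = """filename,finding,view
img_001.jpeg,COVID-19,PA
img_002.jpeg,COVID-19,AP
img_003.jpeg,SARS,PA
img_004.jpeg,COVID-19,PA
"""


def select_scans(metadata_text, finding="COVID-19", view="PA"):
    """Return filenames whose finding and view match the requested values."""
    reader = csv.DictReader(io.StringIO(metadata_text))
    return [row["filename"] for row in reader
            if row["finding"] == finding and row["view"] == view]


covid_pa_files = select_scans(SAMPLE_METADATA)
print(covid_pa_files)
```

With a real metadata file, the same function could be fed the file contents read from disk instead of the inline sample.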

I have also used Kaggle’s Chest X-ray competition dataset to extract X-rays of healthy patients and patients with pneumonia, and have sampled 100 images of each class to balance them against the available COVID-19 images. (I will work on this part and improve the approach.)
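A minimal sketch of drawing a fixed number of images per class is shown below. The file lists, the helper name `sample_per_class`, and the seed are hypothetical; the post does not say how the 100 images were actually drawn.

```python
import random


def sample_per_class(files_by_class, n_per_class=100, seed=42):
    """Draw up to n_per_class filenames from each class without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = {}
    for label, files in files_by_class.items():
        k = min(n_per_class, len(files))  # small classes keep all their files
        sampled[label] = rng.sample(sorted(files), k)
    return sampled


# Hypothetical file lists standing in for the Kaggle folders.
files_by_class = {
    "normal": [f"normal_{i}.jpeg" for i in range(1000)],
    "pneumonia": [f"pneumonia_{i}.jpeg" for i in range(3000)],
}
subset = sample_per_class(files_by_class)
print({label: len(files) for label, files in subset.items()})
```

Fixing the seed keeps the sampled subset identical between runs, which makes results on such a small dataset easier to compare.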

Convolutional Neural Network Approach to detect the presence of COVID-19 from X-rays :

Let’s have a glance at the class-wise distribution of the dataset.

Fig 4: Classwise Distribution of the Dataset

So, in my approach, I have run Convolutional Neural Networks on three classification problems:

  1. Classifying the normal vs COVID-19 cases. [2 class problem]
  2. Classifying pneumonia vs COVID-19 cases. [2 class problem]
  3. Classifying normal vs COVID-19 vs pneumonia cases. [3 class problem]

In some analyses I have seen, people combined the normal and pneumonia cases into a single class. I don’t find this appropriate: the model is then pushed to ignore the between-group variance between those two classes, and the accuracy obtained won’t be a true measure.

Fig 5: Combining two different classes as one class can be misleading. Source

So, this is a simple illustration of the hypothesis above (just for explanation). Let’s say ‘feature1’ and ‘feature2’ span the latent space into which the CNN projects the images, and the images belonging to each of the three classes have been labelled in the figure.

It can be seen that they are currently linearly separable but if we combine the classes ‘Normal’ and ‘Pneumonia’ as one single class, the separability vanishes and results can be misleading. So, if we are combining classes, certain validations need to be done.

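One might argue that the learned projection will take care of that, but this does not necessarily hold here because we are using transfer learning: most of the feature extractor is pretrained and is not relearned for the merged labels.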
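The same effect can be reproduced in a toy setting. The sketch below is not taken from the experiment: it uses a made-up one-dimensional feature in which the COVID-19 points sit between the other two classes. Separating either end class from the rest works with a single threshold, but once Normal and Pneumonia are merged, no single threshold separates COVID-19 from the merged class.

```python
# Toy 1-D "latent feature": COVID-19 points sit between the other two classes.
points = [(-2.0, "normal"), (-1.5, "normal"),
          (0.0, "covid"), (0.3, "covid"),
          (1.8, "pneumonia"), (2.2, "pneumonia")]


def best_threshold_accuracy(samples, positive_labels):
    """Best accuracy of any single-threshold rule, in either orientation."""
    xs = sorted(x for x, _ in samples)
    # Candidate cuts: below all points, between neighbours, above all points.
    cuts = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    best = 0.0
    for cut in cuts:
        for positive_side_is_right in (True, False):
            correct = 0
            for x, label in samples:
                predicted_positive = (x > cut) == positive_side_is_right
                correct += predicted_positive == (label in positive_labels)
            best = max(best, correct / len(samples))
    return best


# An end class against the rest: one threshold is enough.
acc_normal = best_threshold_accuracy(points, {"normal"})
acc_pneumonia = best_threshold_accuracy(points, {"pneumonia"})
# COVID-19 against the merged "normal + pneumonia" class: no threshold works.
acc_covid_vs_merged = best_threshold_accuracy(points, {"covid"})
print(acc_normal, acc_pneumonia, acc_covid_vs_merged)
```

A fixed, pretrained feature extractor is in the same position as this fixed feature: if the merged class surrounds COVID-19 in feature space, a simple classifier on top cannot recover the separation.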

Anyway, in my analysis, the main point is to reduce both false positives and false negatives. Let’s move to our analysis. I have used transfer learning with the VGG-16 model and have fine-tuned the last few layers.

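# Imports assume TensorFlow's bundled Keras; adjust if using standalone Keras.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import AveragePooling2D, Dense, Dropout, Flatten, Input
from tensorflow.keras.models import Model

# VGG-16 convolutional base pretrained on ImageNet, without its classifier head.
vgg_pretrained_model = VGG16(weights="imagenet",
                             include_top=False,
                             input_tensor=Input(shape=(224, 224, 3)))

# New classification head on top of the 7x7x512 feature map.
new_model = vgg_pretrained_model.output
new_model = AveragePooling2D(pool_size=(4, 4))(new_model)
new_model = Flatten(name="flatten")(new_model)
new_model = Dense(64, activation="relu")(new_model)
new_model = Dropout(0.4)(new_model)
new_model = Dense(2, activation="softmax")(new_model)  # one unit per class
model = Model(inputs=vgg_pretrained_model.input, outputs=new_model)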

The final parameter counts of the model are shown below. The model was trained on a Kaggle GPU.

Total params: 14,747,650
Trainable params: 2,392,770
Non-trainable params: 12,354,880
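These counts can be checked by hand. The sketch below recomputes them from the standard VGG-16 layer shapes and the head defined above. The split between trainable and non-trainable parameters is an inference on my part: the numbers are consistent with only the last convolution layer (`block5_conv3`) and the new head being trainable, but the post does not show the freezing code.

```python
def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of one 2-D convolution layer."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """Weights plus biases of one fully connected layer."""
    return in_units * out_units + out_units


# (input channels, output channels) of the 13 VGG-16 convolution layers.
VGG16_CONVS = [(3, 64), (64, 64),
               (64, 128), (128, 128),
               (128, 256), (256, 256), (256, 256),
               (256, 512), (512, 512), (512, 512),
               (512, 512), (512, 512), (512, 512)]

backbone = sum(conv_params(i, o) for i, o in VGG16_CONVS)

# A 224x224 input leaves a 7x7x512 map; 4x4 average pooling reduces it to
# 1x1x512, so Flatten feeds 512 values into Dense(64), followed by Dense(2).
head = dense_params(512, 64) + dense_params(64, 2)

total = backbone + head
# Assumption: only the last convolution layer and the new head are trainable.
trainable = conv_params(512, 512) + head
print(total, trainable, total - trainable)  # prints: 14747650 2392770 12354880
```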

Case 1: Normal vs COVID-19 classification results

Fig 6: Classification report of the COVID-19 vs Normal model

As you can clearly see, the model distinguishes between the two cases with almost 100% accuracy, precision, and recall. To understand this better, I have used gradient-based class activation maps to find which sections of the image matter most in helping the model classify with such accuracy.

To understand more about how gradient-based class activation maps (Grad-CAM) work, please refer to the paper. I will probably go through them in detail in one of my future blogs.
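For readers who want the core idea before opening the paper, Grad-CAM can be summarised in two equations. The notation follows the paper: $A^k$ is the $k$-th feature map of the chosen convolutional layer, $y^c$ is the score for class $c$, and $Z$ is the number of spatial positions in a feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^c \, A^k \right)
```

The first equation averages the gradients over the spatial positions to get one importance weight per channel; the second combines the feature maps with those weights and keeps only the positive part. Averaging over channels instead of summing only changes the map by a constant factor, which disappears once the heatmap is normalised by its maximum.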

Fig 7: Gradient-Based Class Activation Maps

The code for plotting the Grad-CAM heatmaps is given below. I have made a few modifications in order to have a better view.

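import cv2
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import backend as K

# Note: K.gradients only works in graph mode (TensorFlow 1.x, or 2.x after
# tf.compat.v1.disable_eager_execution()) and may be missing in newer Keras.

def get_class_activation_map(ind, path, files):
    # Load the image as RGB at the network's input size and add a batch axis.
    img_path = path + files[ind]
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (224, 224))
    img = np.expand_dims(img, axis=0)

    # Gradient of the predicted class score w.r.t. the last conv feature map.
    predict = model.predict(img)
    target_class = np.argmax(predict[0])
    last_conv = model.get_layer('block5_conv3')
    grads = K.gradients(model.output[:, target_class], last_conv.output)[0]
    pooled_grads = K.mean(grads, axis=(0, 1, 2))  # one weight per channel
    iterate = K.function([model.input], [pooled_grads, last_conv.output[0]])
    pooled_grads_value, conv_layer_output = iterate([img])

    # Weight each of the 512 channels by its pooled gradient.
    for i in range(512):
        conv_layer_output[:, :, i] *= pooled_grads_value[i]

    # Average over channels, keep positive evidence only, scale by the maximum.
    heatmap = np.mean(conv_layer_output, axis=-1)
    heatmap = np.maximum(heatmap, 0)
    heatmap /= np.max(heatmap)
    plt.imshow(heatmap)

    # Upsample the heatmap and multiply it with the grayscale image.
    img_gray = cv2.cvtColor(img[0], cv2.COLOR_RGB2GRAY)  # img is RGB here
    upsample = cv2.resize(heatmap, (224, 224))

    output_path_gradcam = '/kaggle/working/' + files[ind] + 'gradcam.jpeg'
    plt.imsave(output_path_gradcam, upsample * img_gray)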

Class activation Map outputs for Normal patients :

Fig 8: Grad-CAM heatmap for Normal Patients

So, we can see that the model focuses more on the highlighted section when classifying these patients as normal/healthy.

Class activation Map outputs for COVID-19 patients :

Fig 9: Grad-CAM heatmap for COVID-19 Patients

Similarly, the highlighted part is towards the right-end section of the image. This suggests either that this section is an important feature in determining whether the patient has COVID-19, or that COVID-19 has affected the patient in that section. This can be validated against the clinical notes.

Case 2: Pneumonia vs COVID-19 classification results

Fig 10: Classification report of the COVID-19 vs Pneumonia model

Class activation Map outputs for patients with Pneumonia:

Fig 11: Grad-CAM heatmap for Patients with Pneumonia

Case 3: Pneumonia vs COVID-19 vs Normal classification results

Fig 12: Classification report of the COVID-19 vs Pneumonia vs Normal model

In all three cases, the model has performed well even with this small dataset. Moreover, the purpose of building three different models was also to check the consistency of the model in detecting COVID-19 cases. In all three cases, both precision and recall were high for COVID-19 cases on the test data.
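For reference, per-class precision and recall can be computed as in the sketch below. The labels are made up purely to show the arithmetic and are not results from this experiment.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed cases
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for illustration only.
y_true = ["covid", "covid", "covid", "normal", "normal",
          "pneumonia", "pneumonia", "pneumonia"]
y_pred = ["covid", "covid", "normal", "normal", "normal",
          "pneumonia", "covid", "pneumonia"]

covid_precision, covid_recall = precision_recall(y_true, y_pred, "covid")
print(covid_precision, covid_recall)
```

Here a false positive is a non-COVID-19 scan flagged as COVID-19, and a false negative is a COVID-19 scan that the model misses, which are the two error types the analysis aims to keep low.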

** Having said so, this is merely an experiment done on a few images and has not been validated/checked by external health organizations or doctors. No clinical studies have been performed based on the approach which can validate it. This model has been done as a Proof of Concept and nothing can be concluded/inferred from this result. **

But there is huge potential in this approach, and it could be an excellent way to build an efficient, fast diagnosis system, which is the need of the hour. The major advantages are listed below:

Advantages :

The advantages are taken from this source.

  1. Shipping the test kits or the samples is one of the shortcomings of PCR tests, whereas X-ray machines can avoid this problem.
  2. If radiologists and doctors themselves become infected, AI can generate a preliminary diagnosis to indicate whether a patient is affected or not.

Conclusion:

To conclude, I want to reiterate that the analysis has been done on a limited dataset; the results are preliminary and nothing conclusive can be inferred from them. No clinical trials or medical validation have been carried out for this approach.

I plan to increase the robustness of my model with more X-ray scans so that the model generalizes better. Moreover, the number of COVID-19 cases, though increasing exponentially, will remain small compared to the number of healthy people, so there will be a class imbalance. I want to improve my sampling techniques and build a model that can handle this class imbalance, for which I will need more data.
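One common way to handle such an imbalance, shown here only as a sketch and not as the approach planned in this post, is to weight each class inversely to its frequency during training. The label counts below are invented for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Hypothetical imbalanced label list: 900 healthy scans, 100 COVID-19 scans.
labels = ["normal"] * 900 + ["covid"] * 100
weights = inverse_frequency_weights(labels)
print(weights)
```

Weights of this kind can typically be passed to a training routine (for example, Keras `fit` accepts a `class_weight` dictionary keyed by class index) so that errors on the rare class count for more in the loss.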

Also, the current approach is based on fine-tuning the ImageNet weights, but if we can build a model specifically for this purpose, results will be much more trustworthy and generalizable.

So, if you have X-ray scan images of COVID-19 affected patients that are acceptable to the repository, please contribute to the repository as it will be beneficial at these crucial times.

A piece of good news is that MIT has released a database containing X-ray images of COVID-19 affected patients. So, as a next step, I will try to incorporate that data into my modeling approach and check the results.

Moreover, I will keep working on the Class Activation Map outputs based on the gradient values and will validate them against the clinical notes.

Note — I am not from the medical field/biological background and the experiments have been done as a Proof of concept.

Note from the editors: Towards Data Science is a Medium publication primarily based on the study of data science and machine learning. We are not health professionals or epidemiologists, and the opinions of this article should not be interpreted as professional advice. To learn more about the coronavirus pandemic, you can click here.

Stay Safe and Happy reading :)

References :

  1. https://github.com/ieee8023/covid-chestxray-dataset
  2. A brilliant blog on similar lines by Ayrton San Joaquin: https://towardsdatascience.com/using-deep-learning-to-detect-ncov-19-from-x-ray-images-1a89701d1acd
  3. https://github.com/HarshCasper/Brihaspati/blob/master/COVID-19/COVID19-XRay.ipynb
  4. The paper ‘Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization’.
  5. https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
  6. https://www.kaggle.com/michtyson/covid-19-xray-dl#1.-Data-Preparation

Statistical Analyst @WalmartLabs. Masters in Data Science from Indian Statistical Institute. Youngest Speaker@Data Hack Summit Analytics Vidhya’2018