
Cardiovascular Risk – An approach to predict and explain it too


A Machine Learning model may predict cardiovascular risk (Source: Photo by Robina Weermeijer on Unsplash)

Abstract: In this article, you will learn what factors may contribute to cardiovascular risk, how they may be factored into a Deep Learning model to predict risk, why it may make little sense to predict a health risk without offering any explanation, where Explainable AI (XAI) steps in, what kinds of explanations may be given, and what they achieve.

The Objective

Consider a case in Healthcare where a Machine Learning (ML) system is asked to predict the risk of a patient developing Cardiovascular Disease within a given time frame, say the next one year.

The output expected from the model is a risk score: a score at or above a chosen threshold indicates that the patient is at risk of developing cardiovascular disease, while a score below the threshold indicates no risk.

Model considerations

The patient data used for this purpose may be composed of various patient attributes like food habits, work habits, exercise routine, past medical conditions, hereditary markers, genetic markers, results of laboratory tests taken periodically etc.

When the available data is rich and spans many dimensions, Deep Learning (DL) models are typically called into play, as they tend to perform better and give a greater degree of prediction accuracy than simpler models.

Though Deep Learning is most typically used on unstructured data, e.g. classifying images, detecting objects, recognizing speech, converting speech to text, it has also found use with structured data, e.g. medical data values, lab test results, physical attributes, signal readouts from devices. If you are interested in delving deeper into this aspect, I would highly recommend this post by Mark Ryan.
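To make this concrete, here is a minimal sketch of what such a model might look like on tabular patient data. The feature count, architecture and placeholder data below are purely illustrative assumptions (the article does not prescribe any of them), using a Keras/TensorFlow setup.

```python
# Minimal sketch: a Deep Learning model for tabular patient data.
# Feature count, architecture and data are hypothetical, for illustration only.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FEATURES = 40  # e.g. lifestyle, medical and genealogical attributes, encoded/scaled

model = models.Sequential([
    layers.Input(shape=(NUM_FEATURES,)),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # risk score between 0 and 1
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])

# Placeholder data: X is a patient feature matrix, y is 1 if heart disease
# developed within the time frame, else 0. Real data would replace these.
X = np.random.rand(1000, NUM_FEATURES).astype("float32")
y = np.random.randint(0, 2, size=(1000,))

model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
risk_scores = model.predict(X[:5])  # probabilities; compare to a chosen threshold
```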

Beyond Model Performance

When the predicted results of the Deep Learning model are examined by a clinician, a very pertinent question may be raised – Why did the model take the decision that it did?

i.e. why did the model predict that one patient, say A, is at risk of developing heart disease while another patient, say B, is not?

In this case, explanations similar to the following, made available to the clinician, may help.

· Patient A has a sedentary lifestyle with minimal physical activity.

· Patient A has close family members that have had heart ailments and related conditions.

· Patient A has been under medication multiple times in the past 5 years and the lab tests have shown variations in blood composition.

A black box AI model that is opaque (Source: Photo by Brandable Box on Unsplash)

This is where Deep Learning models encounter a problem: a human user cannot obtain an explanation from such a model. The lack of transparency in the model's mechanics is why such models are called Black Box models. It would not be amiss to say that such Black Box models would find it hard to gain acceptance in Healthcare systems because of their opaqueness.

Explainable AI (XAI)

XAI aims to provide human-understandable explanations for how an Artificial Intelligence (AI) system arrived at a decision. Providing explanations for the decisions taken fosters trust in the human users of such systems. Trust is essential for such systems to be deployed across a wide range of use cases in domains like Healthcare, Banking, Insurance or even the Military.

One approach is to choose a model like a Decision Tree or Logistic Regression whose outputs are easier for human users to interpret. Of course, the tradeoff may be a loss in performance or a lack of generalizability due to overfitting to the training data.
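For contrast, here is a quick sketch of that interpretable route using scikit-learn's Logistic Regression. The feature names and data are hypothetical, chosen only to show how the learned coefficients can be read directly as risk contributions.

```python
# Sketch of an inherently interpretable alternative (Logistic Regression).
# Feature names and data are made-up placeholders, not from real patients.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["smoking", "exercise_hours", "systolic_bp", "family_history"]
X = np.random.rand(500, len(feature_names))   # placeholder feature matrix
y = np.random.randint(0, 2, size=500)         # placeholder labels

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Each coefficient indicates how strongly a feature pushes the risk up or down.
for name, coef in zip(feature_names, clf.coef_[0]):
    print(f"{name}: {coef:+.3f}")
```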

Here, we take an alternate approach of sticking to the Deep Learning model but developing methods to offer better explanations than what a baseline Deep Learning model approach may offer.

XAI explains the reasons behind model predictions (Source: Photo by Erda Estremera on Unsplash)

Building an explainable model

The data, representing both patients who have been diagnosed with heart disease and those who haven't, spans multiple dimensions and can therefore be grouped into buckets that carry some semantic meaning. These buckets, along with the specific data dimensions that fall into each, may be –

Lifestyle factors – food habits, work habits, sleep quantity, exercise routines, smoking habits, alcohol intake, stress levels

Medical factors – conditions like hypertension or diabetes, results of lab tests, past medical history, known allergies, known sensitivities

Genealogical factors – heart ailments within the close family, hereditary factors, genetic markers etc.

In addition to the baseline DL model that takes all the available patient data across multiple dimensions as input, consider that there are Component DL models. A Component DL model is one that is fed patient data pertaining to one semantic bucket only.

In this use case, there will be three (3) Component DL models, one each catering to Lifestyle factors, Medical factors and Genealogical factors. Now, the results of all the Component DL models are compared for each prediction. The prediction is a score that indicates the risk of developing heart disease. This is typically expressed as a probability between 0 and 1.
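As a rough sketch of how this might look in code (assuming a Keras setup and hypothetical column groupings, neither of which comes from the article), the Component DL models could be built by slicing the full feature matrix into the three buckets and training one small network per bucket.

```python
# Sketch: one Component DL model per semantic bucket.
# Column ranges are hypothetical stand-ins for real feature groupings.
import numpy as np
from tensorflow.keras import layers, models

bucket_columns = {
    "lifestyle":    slice(0, 15),
    "medical":      slice(15, 32),
    "genealogical": slice(32, 40),
}

def build_component_model(num_features):
    # Narrower input than the baseline model; same sigmoid risk output.
    m = models.Sequential([
        layers.Input(shape=(num_features,)),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    m.compile(optimizer="adam", loss="binary_crossentropy")
    return m

X = np.random.rand(1000, 40).astype("float32")  # placeholder patient data
y = np.random.randint(0, 2, size=(1000,))        # placeholder labels

component_models = {}
for bucket, cols in bucket_columns.items():
    m = build_component_model(X[:, cols].shape[1])
    m.fit(X[:, cols], y, epochs=3, batch_size=32, verbose=0)
    component_models[bucket] = m
```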

Note: Not all of the data dimensions listed above are necessarily equally important from a data modeling perspective. For example, a smoking habit may be considered more important for determining the risk of heart disease than, say, sleep quantity. The importance (weight) attached to each data dimension may be tuned at the time of modeling.

To illustrate this use case better, consider a threshold of 0.6: a score of 0.6 or above may be taken to mean that the patient is at high risk of developing heart disease, and anything lower may be taken to mean that the risk is minimal.
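As a minimal sketch of this comparison logic, the per-bucket scores can be checked against the threshold to surface which buckets drive the risk. The scores below are made-up placeholders standing in for actual model outputs, not values from the table that follows.

```python
# Sketch: comparing Component DL model scores against a risk threshold.
# All scores are illustrative placeholders for model.predict() outputs.
THRESHOLD = 0.6

component_scores = {
    "overall":      0.72,   # baseline model trained on all dimensions
    "lifestyle":    0.81,
    "medical":      0.35,
    "genealogical": 0.48,
}

at_risk = component_scores["overall"] >= THRESHOLD
drivers = [bucket for bucket, score in component_scores.items()
           if bucket != "overall" and score >= THRESHOLD]

if at_risk:
    print(f"Patient flagged as at risk (score {component_scores['overall']:.2f}).")
    print("Buckets driving the risk:", ", ".join(drivers) or "none above threshold")
else:
    print("Patient not flagged as at risk.")
```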

Risk score comparisons across all Deep Learning models (Source: youplusai.com)

The table above offers an explanation for why a certain patient is deemed at risk of heart disease. With this at hand, the clinician may be able to explain the reason for risk to the patient and even recommend certain actions that may help reduce that risk.

A few recommendations and actions taken by the clinician may be as follows.

— More physical activity and adequate sleep for Patient A.

— Discussion on available treatments for Patient B considering that patient’s medical history.

— Detailed conversation with Patient C regarding the patient’s family history.

— Patient D is not at risk overall but needs to watch his or her lifestyle factors more closely, as they may bump up the risk if left unregulated.

A question to ponder

Will an explainable AI (XAI) system replace a clinician?

Absolutely not. An AI or an ML system is a tool in the clinician’s toolkit and does what it does best – crunch data to arrive at insights. This tool may save some valuable time for the clinician. That saved time may perhaps translate into a much better interaction between the patient and the clinician and foster the Doctor-Patient relationship.

With an active interest in Clinical Data Science, I will be exploring many more topics in this domain. If you’re interested in such endeavors, please join along my journey by following me on Twitter and subscribing to the youplusai YouTube channel.

Many thanks to Dr. T. R. Gopalan for his clinical inputs and review.

Originally published at youplusai.com.

