
Why You Should Be Talking About Explainable Machine Learning

As our reliance on machine learning grows, so must our understanding of it.

Photo by Alina Grubnyak on Unsplash

So What’s the Problem?

Short answer: Most machine learning (ML) models are not well understood, and as economies grow more reliant on ML outcomes, so grows the risk of poor predictions and systemic discrimination in AI-driven technologies.

Long answer:

It’s no secret that nearly all medium and large companies make their decisions off the back of data. Most companies utilise AI to some extent, either directly (through internal development) or indirectly (through third-party software). We’re at the point now where "AI" is no longer a buzzword – its use, by most companies, is simply assumed.

At the root of all AI, however, lies machine learning. Traditionally, machine learning models are "black boxes": data feeds in and predictions come out, without developers understanding how or why. A model’s accuracy alone was considered sufficient evidence that it could be trusted.

"With ML now built into most applications, the risk of data-driven systemic discrimination and unfair predictive practices is at an all-time high."

Consider a home loan that is rejected based on one’s ethnicity, gender, religion or sexual orientation. Such a scenario, while unfair and absurd, is not uncommon. Similar examples are prevalent across nearly every industry – imagine the implications of this discrimination in sectors like healthcare, banking and insurance! Thankfully, such cases are generally caused not by malicious intent, but by biased data that the data scientists who built the model failed to catch. With the right technology, correctly applied, we can say goodbye to discriminatory technology and welcome an ML-driven world that is fair and free of bias.

Enter Explainable Machine Learning

As the world’s reliance on ML grows ever deeper, our understanding of how and why predictions are generated, on a model-by-model basis, must mature.

Explainable Machine Learning (EML) is the next generation of machine learning that delivers explainability and interpretability to ML models. It is the technology that will massively reduce the risk of biased and discriminatory machine learning and help organisations and economies make better decisions and operate more efficiently.

So what do I mean by explainability and interpretability? The two terms are often (and mistakenly) used interchangeably, but in the context of ML evaluation they mean different things.

Explainability

Explainability answers the WHAT and WHY behind the inner mechanisms of a model in a language understood by humans, enabling several valuable benefits:

  • The understanding of the model’s function by non-technical domain experts
  • The discovery of prediction errors
  • The identification of previously unseen concepts (what unobserved events could have occurred given the observed events)
  • A better understanding of when uncertainty, bias, or unfairness is present

Interpretability

Interpretability answers HOW a model arrives at a given output without requiring an understanding of its inner mechanisms, helping to uncover answers to crucial questions that are often missed by Data Scientists:

  • If you change specific input parameters, does it lead to the same, better or worse outcome?
  • Will the event still occur if there is a change in the situation?
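A what-if probe of this kind can be sketched in a few lines. As a rough illustration (the feature names, weights and linear-logistic scoring function below are all invented for this example; a real model would be learned from data), we change one input, hold the rest fixed, and compare the predicted outcome:

```python
import math

# Hypothetical linear-logistic scoring model. The weights and features
# are invented for illustration; a real model would be learned from data.
WEIGHTS = {"monthly_fee": 0.8, "support_tickets": 0.6, "tenure_years": -0.5}
BIAS = -1.0

def predict(features: dict) -> float:
    """Return the probability of the positive outcome (e.g. churn)."""
    score = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1 / (1 + math.exp(-score))

customer = {"monthly_fee": 2.0, "support_tickets": 3.0, "tenure_years": 1.0}
baseline = predict(customer)

# "What if" probe: change one input parameter and observe the output.
what_if = dict(customer, support_tickets=0.0)
print(f"baseline: {baseline:.2f}, with no support tickets: {predict(what_if):.2f}")
```

The same perturb-and-compare idea extends to any model you can query, which is exactly why interpretability does not require opening the black box.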

"Explainable machine learning effectively acts as a translator that allows its user to understand, and even change, outcomes."

If you’ve ever worked as or with a Data Scientist, you’ve almost certainly contemplated the questions "why did the model make this prediction?" or "what can we do to change this predicted outcome?". In my own experience, being able to answer these questions clearly and confidently is just as important as fine-tuning your model’s hyperparameters – if not more so. Why? There are several reasons.

What EML Brings To The Table

1. Domain Expertise in Model Evaluation

Excellent models are built with a combination of statistical skill, computer science expertise and domain knowledge specific to the problem at hand (you’ve all seen the classic Venn diagram). Effective models should, therefore, also be evaluated through these same three lenses. EML invites less-technical team members into the conversation of model evaluation and interpretation – something that has been left purely to statistics until now. With these additional brains added to the equation, the lens of domain expertise is introduced to the ML evaluation toolkit.

2. Influencing Outcomes

If a prediction is unfavourable, how are you meant to change the outcome if you don’t understand WHY it is unfavourable? Say you’ve developed a near-perfect model that can predict customer churn (whether a customer will leave or not) with 98% accuracy. You deliver your predictions to the accounts team, who notice that your highest-value customer is almost certain to leave if you don’t take action. What use is this if you don’t know WHY they are likely to leave? Shouldn’t you be able to know which variables are driving this prediction? Is it something you can control?
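For a model with a linear scoring step, answering "which variables are driving this prediction?" can be as simple as breaking the score into per-feature contributions. A minimal sketch (the feature names and weights here are hypothetical; real workflows typically use tools such as SHAP values or permutation importance for non-linear models):

```python
# Hypothetical linear churn model: each feature's contribution to the
# score is simply weight * value, so the prediction decomposes exactly.
weights = {"price_increase": 1.2, "outages_last_month": 0.9, "loyalty_years": -0.7}
customer = {"price_increase": 1.0, "outages_last_month": 2.0, "loyalty_years": 0.5}

contributions = {name: weights[name] * customer[name] for name in weights}

# Rank the drivers of the prediction, most influential first.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>20}: {value:+.2f}")
```

A breakdown like this tells the accounts team not just that the customer will likely leave, but which levers (here, the controllable ones like outages) might change that outcome.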

"EML not only sheds light on why we can expect certain outcomes, but also allows us to understand how those outcomes can be influenced."

3. Executive Buy-In

Why should a decision-maker place trust in your predictions if you can’t easily explain or justify them? An executive with 20 years of industry experience is unlikely to accept "the model scored 90% accuracy on the validation data so you can trust it" as an answer when they question a prediction that goes against their intuition. Being able to visually justify and explain WHY a model arrived at a particular outcome, in their language, is the key to executive buy-in.

4. The Rise of the Citizen Data Scientist

Coupled with Automated Machine Learning (AutoML), EML is allowing less-technical people to conduct day-to-day machine learning tasks that are accurate and free from bias and discrimination. By removing the need for highly-technical Data Scientists for everyday machine learning tasks, EML is driving a new era of the "Citizen Data Scientist", and is allowing technical experts to focus on more complex problems, actions and outcomes.

The Next Generation

Traditionally, Data Science projects have been heavily centred around gathering the right data, cleaning and engineering the right features, and selecting and tuning the right models. The next generation of fair, automated and explainable machine learning is just beginning.

EML is empowering the citizen data scientist by connecting less-technical workers with complex ML models and shifting the data labour market from those that can build ML models to those who can understand them, interpret them, and take action. With it, we can create an ML-driven world that is fair and free from bias and discrimination.
