Opinion
Explainable AI is not a theory anymore!!
Written by Chandan Durgia and Prasun Biswas
If one were to rank economic theories by the variety of their applications, "Game Theory" would surely be in the top percentile. Like other theories, game theory has evolved in its interpretation and applications: starting with the basic postulate of how agents choose between different options given their competitors' views, and progressing to Nobel prize-winning work such as the "Nash equilibrium" (John Nash, 1994) and "Stable allocations and the practice of market design" (Lloyd Shapley and Alvin Roth, 2012).
Across areas like politics, gaming, marketing and philosophy, game theory has been instrumental in defining how people make better decisions given their interactions with other cooperative or competitive players.
Before we delve further, let's understand the problem which Shapley et al. (called Shapley henceforth) tried to solve using game theory. In simple words, Shapley devised a unique solution (termed the "Shapley theorem" henceforth) to the question: for a given "work", what should each associated person's share of the rewards/costs be when they have different or similar objectives? In other words, what is the marginal contribution of each player to the rewards/costs of a game with certain objectives?
Some interesting problem statements which could leverage the theorem are:
- Airport problem: How much should each agent pay for the common runway?
- Taxi fare problem: In a shared ride, what exactly should each passenger's contribution be?
- What is the optimal way to allocate a team's total bonus among the team members? 🙂 (If a free-luncher keeps eating into your credit, you badly want this one.)
The use cases for the theorem are vast.
What has been fascinating, however, is how the theorem has left an indelible mark even in the space of Artificial Intelligence/Machine Learning (AI/ML). From a modelling perspective, Shapley values help in attributing the "contribution" of each feature and the "directionality" of its impact on the dependent variable.
For a given value function $v$, Shapley values attribute the reward/cost among the agents per the following equation:

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)$$

where $N$ is the set of all players (features), the sum runs over all subsets $S$ of the remaining players excluding $i$, and $v(S)$ is the value (model output) achieved by coalition $S$.
Conceptually, Shapley is quite intuitive. For a given model, the marginal contribution of a feature is calculated by:
· analyzing the difference in the output when the feature is included versus excluded,
· averaging over all N! possible orderings,
· across all subsets of the remaining features.
All these three components are highlighted in the formula above.
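To make this concrete, below is a minimal brute-force sketch in Python that computes exact Shapley values by averaging marginal contributions over every ordering. It uses the taxi fare problem from earlier, with assumed solo fares of 6, 12 and 42 for three passengers riding along the same route (so a coalition's cost is the most expensive solo trip in it); the numbers and function names are illustrative, not from any real data.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over all len(players)! orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(coalition)
            coalition.add(p)
            totals[p] += value(coalition) - before  # marginal contribution
    return {p: t / len(orderings) for p, t in totals.items()}

# Taxi fare example (assumed): solo fares along a shared route.
solo_fare = {"A": 6.0, "B": 12.0, "C": 42.0}

def ride_cost(coalition):
    # A shared ride costs as much as the longest solo trip in it.
    return max((solo_fare[p] for p in coalition), default=0.0)

print(shapley_values(list(solo_fare), ride_cost))
# {'A': 2.0, 'B': 5.0, 'C': 35.0}
```

Note how the three shares sum exactly to the full fare of 42, which is precisely the first property discussed next.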
Additionally, Shapley values satisfy a couple of key properties:
1. Additivity (efficiency): The Shapley values should sum up to 100%, i.e. the contribution is divided amongst all players in such a fashion that the total of the marginal contributions equals the full payoff.
2. Monotonicity: If a particular feature has a higher contribution, it should always have a higher Shapley value/reward.
From a Python perspective, the "shap" package is used to calculate the Shapley values. Furthermore, it provides 3 key plots:
- Summary Plot: This plot helps in understanding
a. Relative importance of each feature: For example, in the summary plot below, Feature 5 is the most important, followed by the other features in decreasing order of importance.
b. Directionality: Whether a high/low value of the feature, on average, increases/decreases the model output. For example, for Feature 1 in the plot below, the blue/low values generally correspond to a negative SHAP value (lower model output), and vice versa for the red/high values.
[Figure: SHAP summary plot]
- Dependence Plot: This provides a view into the kind of relationship between the SHAP value and the value of the variable, i.e. whether the relationship is monotonic, U-shaped, etc. In addition, it highlights the variable most correlated with (interacting most with) the given feature. In the plot below, Feature 15 has a monotonic relationship, and Feature 1 is strongly correlated with Feature 15.
[Figure: SHAP dependence plot]
- Force Plot: This is very similar to the commonly used waterfall diagram, wherein the contribution of each feature is highlighted, showing how the prediction moves from the SHAP base value to the final model output.
[Figure: SHAP force plot]
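As a quick illustration, here is a minimal sketch of how these plots are typically produced with the shap package. The model, data and feature names are placeholders assumed for illustration, using a tree-based regressor so that TreeExplainer can compute exact Shapley values efficiently.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

# Toy data with placeholder feature names (illustrative only).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 5)),
                 columns=[f"Feature {i}" for i in range(1, 6)])
y = 2 * X["Feature 5"] - X["Feature 1"] + rng.normal(scale=0.1, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Additivity check: base value + sum of contributions = prediction.
assert np.allclose(explainer.expected_value + shap_values.sum(axis=1),
                   model.predict(X), atol=1e-4)

shap.summary_plot(shap_values, X)                  # importance + directionality
shap.dependence_plot("Feature 5", shap_values, X)  # shape of the relationship
shap.force_plot(explainer.expected_value,          # per-prediction waterfall view
                shap_values[0, :], X.iloc[0, :], matplotlib=True)
```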
Having covered SHAP in decent detail, let’s now address the elephant in the room.
Could it make an impact in the risk analytics vertical?
Credit Risk Analytics (Regulatory), as a domain, has been fairly constrained in the analytics methodologies it can use. The primary reason is the "explainability of models" mandated by financial risk regulations. The idea behind these regulations is to ensure that the models and their features are clearly understood by management and that no risk drivers are missed from the model features. Understandably, this has been a big impediment to the usage of core AI/ML (black-box) models. If we were to break it down, explainability here would mean:
- Understanding the features/variables in the model.
- Understanding the "directionality" of the model variables.
- Understanding the "contribution" of each of the model variables.
- Knowing the parameters of the variables (including the activation function). This basically means knowing the equation.
For a long period, researchers could only get the names of the features in AI/ML models (#1 in the list above), while what happened to those features inside the black-box models remained veiled. Though Shapley's theorem has been around for quite some time, it is only in the last few years that it has been used to understand the "directionality" (#2 in the list above) and "contribution" (#3 in the list above) of each feature. Given this, the only part left unsolved is knowing the equation (#4 in the list above).
Though the explainability problem is still not completely solved, from a credit risk analytics perspective, leveraging the Shapley theorem and the derived Shapley values could open new avenues:
1. Reliable challenger models (Regulatory Models): Some banks have been using sophisticated AI/ML models as challenger models. However, neither the modelers nor the management could sufficiently rely on these models, primarily because there was no assurance that, for a given portfolio, the model's features carried the right weightages and an intuitive signage/directionality. With the introduction of Shapley values, modelers and management can rely on these models more, knowing that the right features are given appropriate importance and that the directionality of the features is correct.
2. Mining the right features (Regulatory Models): In many credit portfolios, feature engineering to arrive at a reliable model is difficult and time-consuming. In such cases, modelers can leverage complex AI/ML models together with Shapley values to understand the main drivers of the dependent variable (see the sketch after this list). Once the modeler has a better understanding of the main features, their contribution levels and their directionality, the same features can be used in a non-AI/ML methodology to build more meaningful, reliable and sensitive models.
3. Impact on Non-Regulatory Models: Another area of credit risk analytics where AI/ML is finding its space is application/behavioral/collection scorecards. Though the financial industry has been using AI/ML models for these cases for quite some time, key concerns around explainability have remained. For example, with an application scorecard, when a loan is rejected, the customer wants to know on what basis it was rejected, while management wants to understand the financial risk around rejecting customers. Since Shapley values help explain the contribution and directionality of each feature, it has become much easier for modelers to rationalize the decisions produced by the model.
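To sketch the feature-mining idea from point 2, one common pattern is to rank features by their mean absolute SHAP value and carry only the top drivers into a simpler, fully explainable model. The snippet below continues the illustrative example from earlier (`shap_values`, `X` and `y` are the placeholders defined there), and the cut-off of three features is an assumption, not a recommendation.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Global importance of each feature: mean(|SHAP value|) across all rows.
importance = pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns)
print(importance.sort_values(ascending=False))

# Keep the top drivers (cut-off chosen arbitrarily for illustration).
top_features = importance.sort_values(ascending=False).head(3).index.tolist()

# Re-fit a simple, regulator-friendly model on the mined features only.
simple_model = LinearRegression().fit(X[top_features], y)
print(dict(zip(top_features, simple_model.coef_.round(3))))
```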
Admittedly, there is still more work required to make these black-box models completely explainable per regulators' expectations; however, Shapley values have surely helped the industry make significant strides in improving the acceptability of AI/ML models in the risk analytics space.
John Nash once said, "I can observe the game theory is applied very much in economics. Generally, it would be wise to get into the mathematics as much as seems reasonable because the economists who use more mathematics are somehow more respected than those who use less. That's the trend." We, as risk analytics practitioners, are glad and grateful that the theory found its way into mathematics and consequently produced constructs like Shapley values, which have the potential to create a paradigm shift in the risk analytics space.
There are still miles to cover but it doesn’t hurt to be hopeful!!
Disclaimer: The views expressed in this article are opinions of the authors in their personal capacity and not of their respective employers.