Not only is offering AI transparency and explainability the most responsible thing to do, but it can also help us find ways to improve our data-driven products and offer value to users.

With large-scale machine learning systems deployed in vulnerable spaces around the world, black-box models are becoming increasingly dangerous without checks and balances.
Take, for example, a healthcare system where unexplainable AI models power classification of chronic illness or regression of certain risk factors. Each prediction has a real impact on human lives, and the usual gap between training and production accuracy now carries even more weight.
More worryingly, some of the errors caused by shifts in the production data distribution may not surface until years later (as is the case with chronic illness prediction), which makes preventing data-drift damage just as important as detecting data drifts once they happen. For this reason, we want to predict drift damage and create backup models that specifically target our current model's weaknesses. But before that, we first need a way to explain and understand our model, so that we know where those weaknesses are.
In this blog, we will explore an alternative way to protect models from unsuspected concept and feature drift: predicting the malignancy of potential data drifts with a regression model for regression models.
The idea of creating a general model for models is inspired by recent publications in meta-learning, but instead of relying on traditional gradient-based methods, we are interested in generating our own features, as described later.
1 Problem Statement
In this blog, we will refer to model performance degradation as malignancy, since not all data distribution shifts cause a significant performance decrease, and not all models are evaluated with the same metrics (as we will see in the next section).

To simplify our problem statement, we divided up the problems we might face in Figure 1 above. We are creating a Predictor Model that takes in features generated from data distribution summaries and model summaries of an arbitrary Base Model.
On the vertical axis, we differentiate Predictor Models: regression models can be used to directly predict how severe the malignancy is, while classification models are used to predict the most vulnerable features (based on quantile). The goals are clearly different: in one case we want to know exactly how much damage data drift can cause, while in the other we are simply looking for the most vulnerable features.
On the horizontal axis, we separate the types of models we are trying to evaluate drifts on. The reason we separated regression problems from classification problems is fairly simple: the metrics (i.e., loss functions) for evaluating the Base Model's degradation are completely different for the two tasks, so their "malignancy" cannot be equated.
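To make the distinction concrete, here is a minimal sketch of how malignancy labels might be computed for the two Base Model types. The specific metrics (MSE and accuracy) are our assumptions for illustration, not a fixed choice from the blog.

```python
from sklearn.metrics import accuracy_score, mean_squared_error

def regression_malignancy(base_model, X_clean, X_drifted, y):
    """Malignancy for a regression Base Model: increase in MSE after the drift."""
    clean_loss = mean_squared_error(y, base_model.predict(X_clean))
    drifted_loss = mean_squared_error(y, base_model.predict(X_drifted))
    return drifted_loss - clean_loss

def classification_malignancy(base_model, X_clean, X_drifted, y):
    """Malignancy for a classification Base Model: drop in accuracy after the drift."""
    clean_acc = accuracy_score(y, base_model.predict(X_clean))
    drifted_acc = accuracy_score(y, base_model.predict(X_drifted))
    return clean_acc - drifted_acc
```

Because these two quantities live on different scales, a single Predictor Model cannot treat them interchangeably.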
2 Data Generation

In general, we are interested in creating a generalizable model that can predict the damage of drift in a single feature. In this way, we can estimate the impact on model metrics from adversarial attacks or from unexpected distribution or concept shifts.
To do this, we will first need to create a model-agnostic regressor (or classifier) that is trained only on the metadata of the features and the model summaries of many datasets. We won't use the actual datasets themselves, since they are often too high dimensional and their shapes differ across models. In previous literature, the authors often used dimensionality reduction techniques to handle this. Since we are interested in explaining the feature summaries, we opted not to do so.
Then, we will need to test the model against a separate, unseen dataset (and model) to see if it can predict an arbitrary model's performance on this new dataset.
While the Predictor Model will not have access to each model's training data points (or dimensionally reduced versions of them), we will allow feature summaries and explainability metrics for features, such as summaries of the SHAP values, as we will explain later.
In other words, we will be generating training samples based on features: each entry of training data for the Predictor Model will be generated solely from a single feature's summaries and explainability metrics. We will not know anything else about the data or the potential shift that will occur, only the loss in performance the model experienced. See Figure 2 for more detail.
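As a rough illustration, one training row for the Predictor Model could look like the sketch below; the exact summary statistics and column names are hypothetical, chosen only to show the kind of per-feature metadata involved.

```python
import numpy as np
from scipy.stats import skew

def make_predictor_row(feature_values, feature_shap_values, model_type, malignancy):
    """Build one Predictor Model training row from a single feature's
    distribution summary, its SHAP summary, and a coarse Base Model summary."""
    return {
        # feature distribution summary
        "feat_std": float(np.std(feature_values)),
        "feat_skew": float(skew(feature_values)),
        "feat_iqr": float(np.quantile(feature_values, 0.75) - np.quantile(feature_values, 0.25)),
        # explainability summary for this feature
        "shap_mean_abs": float(np.mean(np.abs(feature_shap_values))),
        "shap_std": float(np.std(feature_shap_values)),
        # Base Model summary (e.g. "tree", "linear", "neural")
        "model_type": model_type,
        # target: observed loss in Base Model performance after the drift
        "malignancy": malignancy,
    }
```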
3 Drift Generation
We generated drifts by progressively increasing the Gaussian noise added to each of the features. While progressiveness is not strictly necessary, it ensures that the noise is random yet still reflects the gradual changes we see in the real world.
There are several key constants we hold to maintain data integrity. First, we only perturb one feature at a time, so we can isolate the malignancy score of the drift on that feature. Second, we do not refit the model after the initial data split; we simply use it to predict on the drifted data and measure the resulting malignancy. Third, all additive noise is normalized with respect to each feature's standard deviation. A sketch of this loop is shown below.
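This is a minimal sketch of the drift-generation loop; the specific noise levels are illustrative placeholders, not the ones used in our experiments.

```python
import numpy as np

def drift_one_feature(X, feature_idx, noise_levels=(0.1, 0.25, 0.5, 1.0), seed=0):
    """Progressively add Gaussian noise to a single feature.

    The noise is scaled by that feature's standard deviation, all other
    features are left untouched, and the Base Model is never refit.
    """
    rng = np.random.default_rng(seed)
    feat_std = X[:, feature_idx].std()
    for level in noise_levels:
        X_drifted = X.copy()
        X_drifted[:, feature_idx] += rng.normal(0.0, level * feat_std, size=len(X))
        yield level, X_drifted
```

Each (level, X_drifted) pair is then scored against the frozen Base Model to obtain one malignancy label.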
After the above steps for drift generation, we move on to the feature generation part mentioned in Data Generation.
4 Predictor Explainability Features
After we generate the drift, we need to come up with feature summaries for the models and for each feature. The simple features include data distribution statistics and model summary data (e.g., Tree Model, Neural Model, Linear Model, etc.).
The more complex features we included are summaries of model explainability metrics (or feature importance metrics). We chose SHapley Additive exPlanations (SHAP), a game-theoretic approach to explaining feature importance. Although there are other methods for generating explainability, we chose SHAP because its generation method is not model dependent, and thus it can be generated for all types of models. See an example of SHAP features generated from the California Housing Dataset with an MLP Regressor Base Model.

Intuitively, we can think of the feature importance metrics as weights that become "biased" toward one side of the regression (or classification) whenever the data shifts. Since SHAP value generation can be model agnostic, we can generate such features for any model and any dataset.
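For example, a model-agnostic way to produce such SHAP summaries (here with shap's KernelExplainer on the California Housing dataset and an MLP Regressor Base Model, as in the example above) might look like the following sketch; the sample sizes and network architecture are arbitrary choices.

```python
import numpy as np
import shap
from sklearn.datasets import fetch_california_housing
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = fetch_california_housing(return_X_y=True)
base_model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
).fit(X, y)

# KernelExplainer only needs a prediction function, so it works for any model.
rng = np.random.default_rng(0)
background = X[rng.choice(len(X), size=50, replace=False)]
explainer = shap.KernelExplainer(base_model.predict, background)
shap_values = explainer.shap_values(X[:100])  # shape: (100, n_features)

# Per-feature SHAP summaries that become Predictor Model features.
shap_mean_abs = np.abs(shap_values).mean(axis=0)
shap_std = shap_values.std(axis=0)
```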
Lastly, feature normalization across datasets is critically important, since every dataset is completely different. While normalization minimizes the differences across features, it cannot be done naively for certain model features, such as coefficient values for linear models. For such model summary features, we had to transform and normalize some of them differently, for example by normalizing with respect to feature distribution quantiles.
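One plausible realization of that quantile-based normalization, for linear-model coefficients, is sketched below; the exact transform we used may differ.

```python
import numpy as np

def normalize_coefficients_by_quantiles(coefs, X):
    """Rescale linear-model coefficients by each feature's inter-quantile range,
    so the values are comparable across datasets with very different feature
    scales (each coefficient then expresses the effect of one "typical" spread
    of its feature)."""
    iqr = np.quantile(X, 0.75, axis=0) - np.quantile(X, 0.25, axis=0)
    return np.asarray(coefs) * iqr
```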
5 Experiments and Results
For our experiments, we used several datasets built into Google Colab or available through sklearn. For the test set, we use an unseen dataset and ask the Predictor Model to predict the malignancy of drifts on it.
We first look at Regression Predictors (i.e., predicting the actual amount of model degradation due to drift).
For regression Base Model malignancy, we used the Diabetes, Boston Housing Price, and California Housing datasets. For classification Base Model malignancy, we used the Iris, Wine Quality, and Breast Cancer datasets. We report the best results across multiple regression models.

We then use the same set of features for the Classification Predictor (i.e., predicting the top k% most vulnerable features by quantile).

We can see that the accuracies vary based on the quantiles of features we are interested in predicting. In general, we found that the Classification Predictor (regardless of Base Model) has much better relative success, with accuracies ranging between 55% and 90% for every dataset. This intuitively makes sense: by simply looking at feature importances (the SHAP summary), we can often tell which features our model is highly dependent on.
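For reference, constructing those vulnerability labels could look roughly like this, under the assumption that "vulnerable" means a feature's observed malignancy falls in the top-k quantile for its dataset.

```python
import numpy as np

def vulnerability_labels(per_feature_malignancy, top_quantile=0.25):
    """Label a feature as vulnerable (1) if its observed malignancy is in the
    top `top_quantile` of all features for this dataset, else 0."""
    per_feature_malignancy = np.asarray(per_feature_malignancy)
    threshold = np.quantile(per_feature_malignancy, 1.0 - top_quantile)
    return (per_feature_malignancy >= threshold).astype(int)
```

The Classification Predictor is then scored on how often it recovers these labels on the held-out dataset.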
6 Explainable AI: Using Explainability Metrics to Evaluate Explainability Features
Models aside, we decided to use SHAP to explain our Predictor Models, which themselves use SHAP summaries as many of their features. As it turns out, SHAP value is an important predictor of model degradation. See Figure 6 for details.
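Concretely, explaining the Predictor Model with SHAP works the same way as for the Base Models. In the sketch below, the Predictor Model is assumed to be a random forest trained on a placeholder meta-feature table, since the blog does not pin down its exact form.

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

# Placeholder meta-feature table standing in for the one built in Section 2.
rng = np.random.default_rng(0)
meta_X = pd.DataFrame(
    rng.normal(size=(500, 4)),
    columns=["feat_std", "feat_skew", "shap_mean_abs", "shap_std"],
)
meta_y = rng.normal(size=500)  # observed malignancy labels

predictor_model = RandomForestRegressor(random_state=0).fit(meta_X, meta_y)

# Explain the Predictor Model itself with SHAP (TreeExplainer handles forests).
explainer = shap.TreeExplainer(predictor_model)
meta_shap_values = explainer.shap_values(meta_X)
shap.summary_plot(meta_shap_values, meta_X)
```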

Although ironically recursive, just like the title "Regression for Regression", this insight gives us many reasons to care more about our models' explainability. It brings us back to, and highlights, the subtext of this blog:
Not only is offering AI Transparency and Explainability the most responsible thing to do, but it can also help us find ways to improve our data-driven products and offer value to customers.
7 Lessons Learned and Future Work
While it is possible to manually drift the features of new data every time and calculate the drift malignancy that way, such drift will almost always carry human biases, much like the ones we may have unintentionally introduced through our choice of datasets and models.
We want to end by noting that we intended this blog to showcase that it is possible to estimate the damage of drifts without understanding the underlying structure of the drift (we only encoded explainability, feature summary, and model summary information related to the training set, but nothing about the drift type itself).
In a non-academic setting where we have access to large data lakes containing real-world, detectable, and non-random data drift, we can use the same method to estimate the potential damage similar drifts could do to our model and take precautionary measures, such as fitting a second model to fall back on when drift is detected, or boosting the model. In the future, we hope to explore other features that can better summarize models and features.