Is Interpreting ML Models a Dead-End?

The interpretation process can be detached from the model architecture

Valeria Fonseca Diaz
Towards Data Science


Is Interpreting ML Models a Dead-End? (Image by author)

Models are nowadays our primary tool to understand the phenomena around us, from the movement of the stars to the opinions and behavior of social groups. With the explosion of machine learning (ML) theory and technology, we have been equipped with the most powerful tool in the history of science to understand a phenomenon and predict its outcome under given conditions. By now, we are able to detect fraud, design transportation plans, and make progress on self-driving cars.

For all the potential of machine learning to model a phenomenon, the complexity of its models has stood in the way of its democratization. While many models have the unquestionable ability to give us the predictions we are looking for, their use in many industries is still limited for reasons such as lack of computational power or limited software availability. Another reason, rarely discussed as a limiting factor, is the claimed impossibility of interpreting highly complex black-box or deep learning (DL) models. Under this claim, many practitioners find themselves trading prediction accuracy for model interpretability.

We have made unprecedented progress in statistical and machine learning methodology, moving from classical statistical paradigms to today's deep learning solutions. The problem of interpretability arose alongside this methodological evolution, drawing an inverse relationship between our capacity to interpret a model and its predictive accuracy. This trade-off has come to divide the community between those who build models to interpret them and those who adopt models to deliver accurate predictions. Such a separation seems to have established a dead-end for users, forcing them to choose one of two paths: interpretability or accuracy.

Is it really a dead-end?

Models are machines. Just like cars or computers, machine learning models are machines that are built from some material (data and algorithms) and are used by feeding them an input to obtain an output. When we drive a car, we combine different input features, such as the gear, the steering wheel position, and the accelerator, and the car will respond: it will move or not depending on the combination of these features. Likewise, when we use a computer, we operate a mouse, a keyboard, or a microphone to provide the computer with an input, and the computer shows the corresponding output on a screen or through a speaker. While most of us are not engineers who know in detail every single piece inside a car or a computer, we are generally able to learn how to operate them by understanding the relationship between the input components and the expected output. Hence, if our models are machines just like these, understanding or interpreting them might not be, after all, about the engineering inside them.

Until now, the classical paradigm of statistics has educated us with the idea that interpreting a model means building a story around its coefficients. We have examples such as linear regression, where interpretation is about contextualizing the beta coefficients of the model, or logistic regression, where the so-called “odds ratio” has been the favored arrangement of coefficients for building a narrative about the model. With this conception, interpreting a deep learning model, which might be composed of hundreds of layers with millions of coefficients, does look like a dead end. As it turns out, finding narratives around the numbers that compute the predictions of ML models would be equivalent to understanding the basic operation of transistors in order to interpret a computer. We might be able to redefine what the process of interpreting a model is about.
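
To make the contrast concrete, here is a minimal sketch of what that classical, coefficient-centered interpretation looks like in practice. The data and feature names below are invented for illustration; the point is only that the entire narrative is read off two fitted coefficients.

```python
# A minimal sketch of coefficient-centered interpretation: fit a logistic
# regression on invented synthetic data and narrate its odds ratios.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                      # two hypothetical features
y = (0.8 * X[:, 0] - 1.2 * X[:, 1] + rng.normal(size=500) > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# The "story" lives in the coefficients: exp(beta) is the odds ratio,
# i.e. the multiplicative change in the odds per unit increase of a feature.
for name, beta in zip(["feature_1", "feature_2"], clf.coef_[0]):
    print(f"{name}: beta = {beta:.2f}, odds ratio = {np.exp(beta):.2f}")
```

With two coefficients this story is easy to tell; with millions of them, it is not.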

Architecture of a linear model (Image by author)
Example of the architecture of a deep learning model. (Image by author)

Let’s ask the neural network what we would ask about a physical machine: What happens if?

We focus here on one practical and effective way to achieve interpretability, regardless of the mathematical equation or algorithm built underneath: play with the models by asking what happens if. This type of interpretation is usually called “what-if”. Because our models provide a prediction given a combination of features as input, we can design different combinations of inputs and observe the output the model gives. If we do this with a linear model, linear changes in the input will produce linear changes in the output. If we do this with a DL model, linear changes in the input will reveal the nonlinear changes in the output. This is where the “what-if” interpretation of the model takes place. Note that the user of the model defines the combinations of inputs that are fed to the model and connects them to the observed changes. Designing such combinations of input features is a matter of context knowledge and no longer a matter of the engineering required to build the models.
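
As a rough illustration of this idea, here is one way a what-if sweep could be implemented. It is a sketch, not a library recipe: the function name, the reference sample, and the sweep values are placeholders chosen by the user, and any fitted model exposing a predict() method can be plugged in.

```python
# A minimal "what-if" sweep, independent of the model architecture.
import numpy as np

def what_if(model, reference_x, feature_idx, values):
    """Vary one input feature over `values`, keep the rest fixed,
    and return the model's prediction for each variant."""
    variants = np.tile(reference_x, (len(values), 1))   # copy the reference input row
    variants[:, feature_idx] = values                   # apply the designed changes
    return model.predict(variants)

# Usage sketch (assumes `model` is already fitted and `x0` is one input row):
# sweep = np.linspace(x0[idx] - 1.0, x0[idx] + 1.0, 50)
# responses = what_if(model, x0, idx, sweep)
# A linear model traces a straight line over `sweep`; a DL model traces a curve.
```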

An example

Here we describe an example of a model that takes spectral signals of mango fruits as input to predict their dry matter content. This setting is typical of modern quality control processes, where products are monitored with devices that collect a signal and a prediction of the product's chemical information must be retrieved from it. The inputs can be visualized as signals, where each wavelength of the signal is one input feature. From these inputs, the models compute predictions as continuous values of the chemical property. The visualization below provides an interpretation of a linear and a DL model trained on the same data. A linear range of changes in one of the wavelengths was defined to observe the changes in the predicted chemical values. Naturally, the changes in the predicted values of the linear model are linear. In the case of the DL model, we detected the type of nonlinearity that the linear changes in the input feature cause in the predictions. Using the DL model, the chemist now knows that the predicted dry matter content changes parabolically with the input at wavelength 933 nm.

Example of linear model interpretation. Linear changes at wavelength 933 nm in the input (left) and linear changes in the predicted values (right). The color represents the scale of changes. (Image by author)
Example of DL model interpretation. Linear changes at wavelength 933 nm in the input (left) and detected parabolic changes in the predicted values (right). The color represents the scale of changes. (Image by author)
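
The figures above come from the author's analysis of the mango data. As a self-contained stand-in, the sketch below reuses the what_if helper from the previous snippet on synthetic “spectra” with a built-in quadratic effect, comparing a linear model against a small neural network. The data, models, and band index are invented for illustration; they are not the models behind the figures.

```python
# Toy stand-in for the spectral example: synthetic "spectra" whose response
# depends quadratically on one band, so there is a nonlinearity to recover.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
spectra = rng.normal(size=(1000, 50))              # toy spectra, 50 "wavelengths"
dry_matter = (2.0 * spectra[:, 20] ** 2            # quadratic effect at band 20
              + spectra[:, 5]                      # linear effect at band 5
              + rng.normal(scale=0.1, size=1000))  # noise

linear_model = Ridge().fit(spectra, dry_matter)
dl_model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000,
                        random_state=0).fit(spectra, dry_matter)

x0 = spectra.mean(axis=0)            # reference "spectrum"
band = 20                            # the band we perturb
sweep = np.linspace(-2.0, 2.0, 50)   # linear range of changes at that band

linear_response = what_if(linear_model, x0, band, sweep)  # straight line
dl_response = what_if(dl_model, x0, band, sweep)          # roughly parabolic curve
```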

The interpretation process is not about narrating the coefficients of the model

The interpretation process is not about narrating the coefficients of the model. It transcends the architecture or engine underneath the model. Other machine learning models, as deep as they may be, can be used in a what-if way or in other ways. The process of interpreting models has classically been limited to narrating the numbers of the model architecture, closing the door on the effective use of a prediction model as a tool for understanding. A what-if type of interpretation may be the most basic interpretation tool; many other strategies can be defined depending on what is to be understood about the model.

Several insights about model interpretation for different types of statistical models and algorithms can be found in “Interpreting the model is for humans, not for computers”. The process of model interpretation can be detached from the stages of model building and model deployment. While the model building stage takes care of finding an approximation of the underlying mechanism that connects our input features to the desired output, model interpretation after the model is built is a process that requires the knowledge of an expert researcher, someone able to ask the model the precise questions needed to understand the connections the model is creating. Driven by such questions, we can create exploratory tools, such as the visualization presented in the example, that enable interaction between the researcher and the model.
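
As a bare-bones version of such an exploratory tool, the snippet below plots the two what-if responses computed in the earlier sketch side by side; it assumes the sweep, linear_response, and dl_response arrays from that sketch.

```python
# Plot the what-if responses of the two models over the same sweep of one band.
import matplotlib.pyplot as plt

fig, (ax_lin, ax_dl) = plt.subplots(1, 2, figsize=(9, 3), sharey=True)
ax_lin.plot(sweep, linear_response)
ax_lin.set(title="Linear model", xlabel="input feature value", ylabel="prediction")
ax_dl.plot(sweep, dl_response)
ax_dl.set(title="DL model", xlabel="input feature value")
plt.tight_layout()
plt.show()
```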

Acknowledgment:

The CNN model for the DL example was taken from the repository https://github.com/dario-passos/DeepLearning_for_VIS-NIR_Spectra
