Model Interpretability

In the field of Artificial Intelligence, the trade-off between accuracy and interpretability is a crucial aspect when developing a Machine Learning pipeline. Accuracy refers to how correct the model's predictions are, while interpretability refers to how easily humans can understand those predictions. The balance between the two depends on the final business needs.
When we try to explain the output of our ML pipeline, we should keep a crucial point in mind: interpretability does not necessarily result in explainability. The ability to extract insights from predictions is useless if people lack the knowledge to understand them. This may be the case with some model-agnostic techniques that can be attached to the final step of the flow to provide interpretability outcomes. Although they can be very powerful tools, they may be too much for non-technical people if not used properly. The knockout blow comes when we try to explain the heuristics behind the computation of the interpretability scores.
The best possible solution is to stay simple. The models offering the best and simplest explicative insights are the linear and tree-based algorithms, which are also two of the best-known and easiest-to-understand families. Linear Regression (or Logistic Regression in classification contexts) provides interpretability through the magnitude of its coefficients. In the same manner, we can extract the learning history of a Decision Tree in the form of decision rules.
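To make this concrete, here is a minimal sketch of both kinds of insight with scikit-learn, on synthetic data for illustration: coefficient magnitudes from a linear model, and the learned decision rules of a tree printed as text.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=200, n_features=4, random_state=42)

# Coefficient magnitudes hint at each feature's impact on the prediction
linear = LogisticRegression().fit(X, y)
print(linear.coef_)

# The learned decision rules can be printed as human-readable text
tree = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)
print(export_text(tree, feature_names=[f"x{i}" for i in range(4)]))
```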
For most people, simplicity is synonymous with low predictive power. This is a hasty and wrong conclusion. Performance is relative and depends on the domain of analysis, the quality of the data, and the selection of a proper validation strategy. What remains unchanged is the explicative power of the instruments at our disposal. So, to get the best from the simplest methods, both in performance and in explicative power, we try to combine them.
In this context, we introduce linear-tree: a Python library to build Model Trees with Linear Models at the leaves. The package provides simple BaseEstimators, in sklearn style, that wrap every linear estimator available in sklearn.linear_model and build an optimal tree structure on top of it. During training, the best splits are evaluated by fitting linear models on the candidate partitions of the data. The final model consists of a tree structure with linear models in the leaves. In other words, multiple linear regressions are computed after partitioning the data according to simple decision rules.
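As a minimal sketch of what this looks like in practice, the snippet below wraps a sklearn linear estimator following the linear-tree API (`pip install linear-tree`); the base_estimator parameter and import path reflect the package's documentation, while the hyperparameters and synthetic data are purely illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from lineartree import LinearTreeRegressor

X, y = make_regression(n_samples=500, n_features=4, random_state=42)

# Each split is chosen by fitting the linear estimator on the candidate partitions
model = LinearTreeRegressor(base_estimator=LinearRegression(), max_depth=3)
model.fit(X, y)
print(model.predict(X[:5]))
```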
In this post, we leverage the capabilities of linear-tree to produce useful explicative insights. We do so by carrying out a predictive task: as a by-product of the training process, we can retrieve some very simple and powerful interpretable outcomes at no extra cost.
THE DATA
We collect a financial dataset from Kaggle. With this data in hand, we aim to build a predictive algorithm that quantifies the probability that somebody will experience financial distress in the next two years. We have several numerical variables at our disposal to feed our model (a loading sketch follows the list):
- RevolvingUtilizationOfUnsecuredLines: Total balance on credit cards and personal lines of credit divided by the sum of credit limits (percentage);
- age: Age of the borrower in years;
- NumberOfTime30-59DaysPastDueNotWorse: Number of times the borrower has been 30-59 days past due in the last 2 years;
- DebtRatio: Monthly debt payments divided by monthly gross income;
- MonthlyIncome: Monthly income;
- NumberOfOpenCreditLinesAndLoans: Number of open loans (installment, like a car loan or mortgage) and lines of credit (e.g. credit cards);
- NumberOfTimes90DaysLate: Number of times the borrower has been 90 days or more past due;
- NumberRealEstateLoansOrLines: Number of mortgage and real estate loans;
- NumberOfTime60-89DaysPastDueNotWorse: Number of times the borrower has been 60-89 days past due in the last 2 years;
- NumberOfDependents: Number of dependents in the family, excluding themselves.
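A minimal loading sketch follows. The cs-training.csv file name, the SeriousDlqin2yrs target column, and the median imputation are assumptions based on the public Kaggle "Give Me Some Credit" dataset, not choices prescribed here.

```python
import pandas as pd

df = pd.read_csv("cs-training.csv", index_col=0)
y = df["SeriousDlqin2yrs"]                 # 1 if financial distress within two years
X = df.drop(columns=["SeriousDlqin2yrs"])
X = X.fillna(X.median())                   # MonthlyIncome and NumberOfDependents contain missing values
```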
The exogenous variables are self-explanatory and suit the purpose of our experiment well.
MODELLING AND EXPLAINING
Our goal is to fit a Linear Tree and produce easily interpretable outcomes. For this simulation, we use all the data at our disposal for training. We use the RidgeClassifier from sklearn as the linear estimator to build our tree structure. Different parameter configurations may produce different tree structures, and hence different outcomes. It's our duty to choose and validate the best one.
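A minimal fitting sketch, reusing the X and y loaded above; the depth and leaf-size values below are illustrative placeholders rather than validated choices.

```python
from sklearn.linear_model import RidgeClassifier
from lineartree import LinearTreeClassifier

clf = LinearTreeClassifier(
    base_estimator=RidgeClassifier(),
    max_depth=4,            # illustrative: deeper trees give more, smaller leaves
    min_samples_leaf=0.05,  # illustrative: each leaf keeps at least 5% of samples
)
clf.fit(X, y)
```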

We are particularly interested in the leaves. As in classical Decision Trees, each leaf is the result of the recursive splitting process applied along each explored path. When splitting the data into further partitions yields no more gain in terms of loss reduction, the growing process stops. We end up with a 'clustered' version of our training data, partitioned into different groups/leaves according to simple decision rules.
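To see the resulting partitioning, we can map each training sample to its leaf. The sketch below assumes linear-tree mirrors sklearn's tree API with an apply method, as its documentation suggests.

```python
import numpy as np

# Map every training sample to the index of the leaf it falls into
leaves = clf.apply(X)
for leaf_id, n in zip(*np.unique(leaves, return_counts=True)):
    print(f"leaf {leaf_id}: {n} samples")
```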

Each leaf also contains a fitted Linear Model. Since we are building a Linear Tree, the splits are evaluated considering the weighted sum of the training losses of the left and right child candidates. We end up with a fitted Linear Model in each leaf that predicts all the samples satisfying a given set of decision rules.

At this point, we can inspect the results. By querying the Linear Models, we extract the coefficients and see the impact of each variable in each leaf. That is not all: we can also obtain the set of decision rules built during training, which tell us in which leaf a given sample falls. From the combination of decision rules and coefficients, we have a complete overview of the decision path followed by our model.
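A hedged sketch of this inspection: we assume the package's summary helper returns a dict of nodes in which each leaf carries its fitted model and sample count (the 'models' and 'samples' key names below are assumptions based on the library's documentation).

```python
summary = clf.summary(feature_names=list(X.columns), only_leaves=True)

for leaf_id, leaf in summary.items():
    model = leaf["models"]  # assumed: the RidgeClassifier fitted on this partition
    print(f"leaf {leaf_id}: {leaf['samples']} samples")
    # Coefficient magnitudes show each variable's impact within this partition
    for name, coef in zip(X.columns, model.coef_.ravel()):
        print(f"  {name}: {coef:+.4f}")
```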




SUMMARY
In this post, we introduced the linear-tree package as a tool to build Linear Trees. They are a valuable solution for understanding the relationships between features and providing interpretable outcomes. Linear Trees leverage the combination of Decision Trees and Linear Models to help us better interpret our predictions. However, like all other algorithms, they require proper tuning and data understanding to achieve the desired performance.
Keep in touch: LinkedIn