
A Brief Introduction to Contextual Explanation Networks

Carlos Rodriguez

CENs learn to simultaneously predict and explain

Photo by JJ Ying on Unsplash

With the steady adoption of Artificial Intelligence and Machine Learning algorithms in the real world (McKinsey, 2020), there is growing demand for more adequate methods to interpret or explain an algorithm’s predictions. These algorithms influence critical decisions in finance, healthcare, and criminal justice: creditworthiness, medical diagnoses, or judicial verdicts can be determined largely by a machine (Brotcke, 2020; Rudin, 2019). In many cases, humans cannot fully interpret or explain how a critical decision was made.

To address this type of algorithmic opaqueness, a small team of researchers from Carnegie Mellon and Google has put forward Contextual Explanation Networks (CENs). Contrary to prior explanation methods, CENs learn to simultaneously predict and explain, providing qualitative insight and often improving the original predictions’ performance without incurring additional computational overhead (Al-Shedivat et al., 2020).

Squeezing the LIME

The concept of explaining a model is not a new one. In 2016, researchers from the University of Washington introduced Local Interpretable Model-agnostic Explanations, commonly known as LIME (Ribeiro et al., 2016). LIME proposed supplementing an algorithm with textual or visual artifacts that might provide a qualitative understanding of its behavior. More broadly, the goal was to offer human-interpretable representations of the data as it passes through the algorithm. For example, algorithms used to understand language (e.g., Google Translate) represent words as points in a high-dimensional vector space (Bengio et al., 2006). A more human-interpretable representation is a simple binary vector signaling a word’s presence or absence (Ribeiro et al., 2016).
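To make the idea of an interpretable representation concrete, here is a minimal sketch of turning a sentence into the binary presence/absence vector described above. It is my own illustration rather than LIME’s implementation, and the vocabulary and sentence are made up for the example.

```python
# Minimal sketch: a bag-of-words style binary representation.
# Each position in the vector answers "does this word appear in the text?",
# which is far easier for a human to reason about than a dense embedding.
vocabulary = ["good", "bad", "service", "food", "slow", "friendly"]

def binary_representation(text: str, vocab: list[str]) -> list[int]:
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in vocab]

print(binary_representation("The food was good but the service was slow", vocabulary))
# -> [1, 0, 1, 1, 1, 0]
```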

While LIME and other agnostic explanation methods may provide some transparency, their explanations are learned independently of the model and are never guaranteed to be the basis for the underlying predictions. These drawbacks can lead to erroneous interpretations or misrepresentations of the original algorithm’s methodology (Al-Shedivat et al., 2020). Similarly, popular agnostic methods are susceptible to exploitation merely by introducing noise or randomized data (Al-Shedivat et al., 2020; Kim et al., 2017). Contextual Explanation Networks, in contrast, are designed to avoid these inherent drawbacks.

Offloading Complexity to Encoders

Example of a CEN architecture (Al-Shedivat et al., 2020)

Generally, CENs work in two steps. First, a deep encoder processes one subset of the inputs (the context) and generates the parameters of a simple probabilistic model (e.g., a sparse linear model) whose form is defined by a domain expert (Al-Shedivat et al., 2020). Then, the generated model is applied to the remaining, interpretable subset of inputs to produce a prediction. In short, CENs can represent complex model classes using powerful encoders. By offloading complexity onto the encoding process, the relationship between the interpretable variables and the prediction stays simple (i.e., a linear model) (Al-Shedivat et al., 2020). In other words, all of the complexity distills into high-level concepts that fit the context of the problem.

"CENs can represent complex model classes by using powerful encoders. At the same time, by offsetting complexity into the encoding process, we achieve simplicity of explanations and can interpret predictions in terms the variables of interest ."

  • Al-Shedivat et al., 2020
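Below is a minimal sketch of that two-step structure in PyTorch. It is my own illustration of the idea rather than the authors’ implementation; the layer sizes, class and parameter names, and the use of a plain feed-forward encoder are all assumptions made for brevity.

```python
# A minimal sketch of the CEN idea (illustrative, not the authors' code).
# The encoder sees the raw "context" (e.g., an image embedding) and emits the
# parameters of a simple linear explanation, which is then applied to a small
# vector of interpretable attributes to produce the prediction.
import torch
import torch.nn as nn

class ContextualExplanationNet(nn.Module):
    def __init__(self, context_dim, attr_dim, n_classes, hidden_dim=64):
        super().__init__()
        # Deep encoder: all of the representational complexity lives here.
        self.encoder = nn.Sequential(
            nn.Linear(context_dim, hidden_dim),
            nn.ReLU(),
            # Emit one weight vector (plus bias) per class for the linear explanation.
            nn.Linear(hidden_dim, n_classes * (attr_dim + 1)),
        )
        self.attr_dim = attr_dim
        self.n_classes = n_classes

    def forward(self, context, attributes):
        params = self.encoder(context).view(-1, self.n_classes, self.attr_dim + 1)
        weights, bias = params[..., :-1], params[..., -1]
        # The prediction is a plain linear model over interpretable attributes;
        # `weights` doubles as the per-example explanation.
        logits = torch.einsum("bca,ba->bc", weights, attributes) + bias
        return logits, weights

# Usage: a batch of four 512-d context vectors, 10 interpretable attributes, 3 classes.
model = ContextualExplanationNet(context_dim=512, attr_dim=10, n_classes=3)
logits, explanation = model(torch.randn(4, 512), torch.randn(4, 10))
```

Because the explanation (the generated weights) is the model that actually produces the prediction, it cannot drift apart from the prediction the way a post-hoc explanation can.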

Measuring Success

The team mainly set out to accomplish two goals: defining high-performing, inherently interpretable deep learning architectures and demonstrating where agnostic explanations fall short. Evidence was collected through empirical analysis of CEN against alternatives applied to common use cases (e.g., computer vision, sentiment analysis).

The team measured their success using two metrics: predictive performance against baselines and qualitative insight. In one example, a classifier took satellite images as input, represented by embeddings from the highly complex, pre-trained VGG-F network. Logistic regression and multilayer perceptrons (MLPs) were chosen as baselines. The CEN models outperformed both while also surfacing contextual, meaningful explanations (Al-Shedivat et al., 2020).
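For readers who want to reproduce the spirit of that comparison, here is a hedged scikit-learn sketch of the baseline side of the experiment. The data is a synthetic stand-in for the paper’s precomputed VGG-F embeddings, and the model settings are arbitrary choices, not those used in the study.

```python
# A rough sketch of a baseline comparison: logistic regression vs. an MLP
# on precomputed embeddings (synthetic placeholders here).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 256))               # stand-in for image embeddings
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic labels for illustration
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, clf in [("LogReg", LogisticRegression(max_iter=1000)),
                  ("MLP", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500))]:
    clf.fit(X_tr, y_tr)
    print(name, clf.score(X_te, y_te))         # accuracy, the figure CEN is compared against
```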

Ultimately, the experiments demonstrated that it is possible to map complex low-level inputs to meaningful high-level variables (e.g., categorical features) pre-defined by domain experts. The algorithm’s prediction, paired with a human-interpretable mapping, effectively makes the model transparent. The team managed to accomplish this without incurring additional overhead and, in some cases, with improved performance.

CENs and the "Black Box" problem

The benefits of using CEN in place of agnostic explanations like LIME are clear. The added performance, net-zero overhead, and linear interpretability make CENs a practical alternative. However, CENs may only partially address the "black box" problem that post-hoc explanations are commonly deployed to solve.

The moniker "Black Box" is given to an algorithm under two discrete circumstances. The first is when its underlying mathematical computations are beyond human comprehension (Rudin, 2019). The second circumstance is when the algorithm is proprietary, and the methodology is secret.

The former is addressed elegantly by CENs, which offer at minimum as much explainability as their predecessor LIME. The research does not directly address the latter: the CEN architecture outputs a complete solution and would not be applied post-hoc to explain a proprietary one.

The arguments for using proprietary solutions are substantial. Native Machine Learning applications require multidisciplinary capabilities, infrastructure, and ongoing monitoring and maintenance, so it is often less prohibitive to rely on a third party. The obvious (but hardly inconsequential) trade-off is opaqueness: it is challenging to ensure a proprietary solution’s reliability, robustness, and absence of undesired biases (Al-Shedivat et al., 2020). Like many other emerging interpretability methods, CENs are intended to replace opaque proprietary solutions in favor of transparency.

Although the research clearly demonstrated CEN’s value over LIME, CEN was not evaluated directly against other post-hoc, model-agnostic methods. Practitioners may accept that LIME is generally representative of many widely used explanation methods, but not every explanation method is a good match for every type of model (Hall et al., 2021). Still, the research does not disambiguate between CEN and comparable approaches (Guidotti et al., 2019).

Other contemporary approaches offer many of the same benefits as CEN. For example, a research team at Duke proposed appending a special prototype layer to an algorithm’s existing training process; the final predictions are a weighted sum of the input’s similarities to human-interpretable prototypes (Chen et al., 2019). In 2017, Google introduced Testing with Concept Activation Vectors (TCAV), which shares the idea that it is possible to map complex low-level features to meaningful high-level concepts (Kim et al., 2017). Either of these examples can serve as a useful benchmark or suitable alternative to CEN.
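As a point of comparison, here is a rough sketch of the prototype-layer idea described above. It is my own simplified illustration, not the Duke team’s implementation (ProtoPNet’s actual similarity function and training procedure differ), and the dimensions are chosen arbitrarily.

```python
# Rough sketch of a prototype layer: logits are a weighted sum of the input's
# similarities to a set of learned, human-interpretable prototypes.
import torch
import torch.nn as nn

class PrototypeHead(nn.Module):
    def __init__(self, feature_dim, n_prototypes, n_classes):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, feature_dim))
        self.class_weights = nn.Linear(n_prototypes, n_classes, bias=False)

    def forward(self, features):
        # Similarity here is negative squared distance; ProtoPNet uses a log-based variant.
        similarities = -torch.cdist(features, self.prototypes) ** 2
        return self.class_weights(similarities), similarities

head = PrototypeHead(feature_dim=128, n_prototypes=10, n_classes=3)
logits, sims = head(torch.randn(4, 128))  # `sims` shows which prototypes drove the prediction
```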

Adaptable and Efficient

In my view, the Contextual Explanation Network architecture is a meaningful contribution to interpretability research. This method (and others like it) makes the perceived trade-off between transparency and accuracy less relevant. The adaptability of CEN to existing algorithms and its computational efficiency make it an accessible option for practitioners who value algorithmic transparency.


References

Al-Shedivat M, Dubey A, Xing EP. 2020. Contextual Explanation Networks. Journal of Machine Learning Research 21. [accessed 2021 Jan 28]. https://www.jmlr.org/papers/volume21/18-856/18-856.pdf.

Bengio Y, Schwenk H, Senécal J-S, Morin F, Gauvain J-L. 2006. Neural probabilistic language models. In: Innovations in Machine Learning. Berlin/Heidelberg: Springer-Verlag. p. 137–186.

Brotcke L. 2020. Modifying model risk management practice in the era of AI/ML. Journal of Risk Management in Financial Institutions. [accessed 2021 Feb 6]. https://hstalks.com/article/5661/modifying-model-risk-management-practice-in-the-er/.

Goodfellow I, Bengio Y, Courville A. 2016. Deep Learning. London, England: MIT Press.

Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D. 2019. A survey of methods for explaining black-box models. ACM Comput Surv. 51(5):1–42.

Hall P, Gill N, Kurka M. Machine Learning Interpretability with H2O Driverless AI. H2O.ai. [accessed 2021 Feb 8]. http://docs.h2o.ai/driverless-ai/latest-stable/docs/booklets/MLIBooklet.pdf.

Kim B, Wattenberg M, Gilmer J, Cai C, Wexler J, Viegas F, Sayres R. 2017. Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV). arXiv [statML]. http://arxiv.org/abs/1711.11279.

Lundberg S, Lee S-I. 2017. A unified approach to interpreting model predictions. arXiv [csAI]. http://arxiv.org/abs/1705.07874.

Rudin C. 2019. Stop explaining black-box machine learning models for high-stakes decisions and use interpretable models instead. Nat Mach Intell. 1(5):206–215.

Simonyan K, Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition. arXiv [csCV]. http://arxiv.org/abs/1409.1556.

The state of AI in 2020. McKinsey.com. [accessed 2021 Feb 6]. https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/global-survey-the-state-of-ai-in-2020.

