Using POTATO for interpretable information extraction

Adam Kovacs
Towards Data Science
Feb 3, 2022

About

This article is an introduction to the POTATO library. POTATO is a language independent human-in-the-loop XAI (explainable AI) framework for extracting and evaluating interpretable graph features for any classification problem in Natural Language Processing (NLP).

The article includes:

  • A short introduction to rule-based methods for text classification
  • Introduction to defining graph patterns in POTATO
  • Learning patterns automatically
  • The human-in-the-loop (HITL) framework

Introduction

Currently, text processing tasks (like those in many other domains) are dominated by machine learning models. But as the number of parameters in these models grows exponentially, their explainability decreases.

Interpretable models have, among others, the following traits [1]:

  • Fairness: unbiased predictions
  • Privacy: less information leakage
  • Reliability: small changes in the input do not heavily affect the output
  • Trust, Auditability: we know what caused the predictions
  • Debuggable: if an error happens in production, we can change the model

Recent NLP solutions achieving state-of-the-art results on public benchmarks rely on deep learning (DL) models with millions of parameters (e.g. BERT [2]). These models require large amounts of training data, and it is hard to explain their decisions [3]. Also, deep learning models pose a risk of learning unintended bias from datasets [4]. Rule-based systems may provide accurate and transparent solutions, but can be challenging and time-consuming to build and maintain. POTATO is a fast prototyping framework that supports the creation of rule-based text classifiers. In POTATO, instead of using machine learning models to learn the task directly, we learn the rule-systems instead. Using this method, the final model remains completely transparent.

Rule-based systems

Pros

  • Rule-based systems are interpretable and explainable by design
  • They are popular in “real-world” applications and need no large upfront investment (no need for GPUs)
  • They are fully customizable and can be debugged

Cons

  • Hard to maintain
  • Worse performance on benchmarks (benchmarks are dominated by DL methods)
  • Domain expertise is needed
  • Time-consuming to develop

In POTATO, we try to solve some of the drawbacks of rule-based models by combining machine learning and rule systems: we learn the rules!

To demonstrate rule systems, we will use an example from the Semeval 2010 Relation Extraction dataset. Relation extraction (RE) is the task of extracting semantic relationships between entities from a text. A relation is usually defined between two entities. The relations have semantic categories (e.g. Destination, Component, Employed by, Founded by, etc.), and the task is to classify the relation into the correct category.

We will only use the Entity-Destination label. An example of this class:

  • The diamond ring(entity1) was dropped into a trick-or-treater’s bag(entity2).

To define a rule, we could just use a simple regular expression:

r"entity1 .* dropped into .* entity2"
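To see how narrow such a rule is, here is a minimal sketch that applies it with Python's standard `re` module. It assumes the entity mentions have already been replaced by the placeholders `entity1` and `entity2`; the first sentence is the example above rewritten that way.

```python
import re

# The regex rule from above, compiled once.
RULE = re.compile(r"entity1 .* dropped into .* entity2")

sentences = [
    "The entity1 was dropped into a trick-or-treater's entity2.",
    "The man placed the entity1 into the entity2.",
    "The entity1 were released into the entity2.",
]

# re.search looks for the pattern anywhere in the sentence.
matches = [bool(RULE.search(s)) for s in sentences]

for sentence, matched in zip(sentences, matches):
    print(f"{matched!s:5} | {sentence}")
```

Only the first sentence matches, because the rule is tied to the literal verb "dropped".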

But using only regexes is a naive method, as we know nothing about the structure of the text (tokens, grammatical categories, etc.). We could use more advanced Python packages such as spaCy’s token-based Matcher or the Holmes Extractor. With them, we could define a more complex rule that takes part-of-speech (POS) tags, i.e. the grammatical categories of words, into account:

pattern = [{'POS': 'VERB'}, {'LOWER': 'into'}, {'TEXT': {'REGEX': '.*'}}, {'LOWER': 'entity2'}]

But instead of writing rules on the tokens of the text, we can write rules on graphs, which lets us utilize the underlying graph structure of texts. In POTATO, we use the networkx Python package to represent graphs. With networkx, we provide a unified interface for graph representation, and users can write graph patterns on arbitrary graphs. POTATO currently supports three types of graphs: AMR [5], UD (using the stanza package [6]), and 4lang [7]. An example pattern can be seen in Figure 1, and Figure 2 shows the 4lang graph of the example we defined above (The diamond ring was dropped into a trick-or-treater’s bag) together with the applied pattern.

Figure 1: example pattern defined on a graph, the ANY node means that we match any string in that node (Image by author)
Figure 2: 4lang [7] semantic graph built from the text and the applied feature (Image by author)

As opposed to simple regex patterns, this pattern would also match the following examples:

The man placed the entity1 into the entity2.
Industries have pushed entity1 into fragile marine entity2.
I am putting the entity1 into a MySQL entity2.
The entity1 were released into the entity2.

Usage and setup

POTATO is a human-in-the-loop XAI framework written in python that provides:

  • a unified networkx interface for multiple graph libraries (4lang, stanza, AMR)
  • a python package for learning and evaluating interpretable graph features as rules
  • a human-in-the-loop (HITL) UI framework built in streamlit
  • REST-API to use extracted features for inference in production mode

All of our components are open-source under MIT license and can be installed with pip.

The tool is heavily dependent upon the tuw-nlp repository for building graphs and matching features. You can install tuw-nlp with pip:

pip install tuw-nlp

Then follow the instructions to set up the package.

Then install POTATO from pip:

pip install xpotato

First import packages from potato:

from xpotato.dataset.dataset import Dataset
from xpotato.models.trainer import GraphTrainer

We will demonstrate POTATO’s capabilities with a few sentences manually picked from the Semeval dataset [8].

Table 1: Example sentences from the Semeval 2010 Relation Extraction dataset [8]

Note that we replaced the two entities in question with XXX and YYY.

The next step is to initialize the dataset and provide a label encoding. Then parse the sentences into graphs with the parse_graphs() method, which also requires selecting the graph format. Currently we provide three types of graphs: ud, fourlang, and amr. You also need to provide the language you want to parse; currently we support English (en) and German (de).

We will use the examples from Table 1 (we will refer to the samples by the ids specified in the first column). The dataset can be initialized in Python with the following code:

dataset = Dataset(sentences, label_vocab={"Other":0, "Entity-Destination(e1,e2)": 1})
dataset.set_graphs(dataset.parse_graphs(graph_format="fourlang"))
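As a rough sketch of the input, the snippet below assumes that `sentences` is a list of `(text, label)` pairs. That layout is an assumption here, and the labels attached to the three sentences (all of which appear elsewhere in this article) are illustrative. It uses only plain Python and checks that every label has an id in `label_vocab` before anything is parsed.

```python
# Assumed input layout: one (text, label) pair per sentence, with the two
# entity mentions already replaced by XXX and YYY. Labels are illustrative.
sentences = [
    ("The scientists poured XXX into pint YYY.", "Entity-Destination(e1,e2)"),
    ("The suspect pushed the XXX into a deep YYY.", "Entity-Destination(e1,e2)"),
    ("Sparky Anderson is making progress in his XXX from YYY and could "
     "return to managing the Detroit Tigers within a week.", "Other"),
]
label_vocab = {"Other": 0, "Entity-Destination(e1,e2)": 1}

# Sanity check before parsing: every label must have an id in the vocabulary.
unknown = {label for _, label in sentences if label not in label_vocab}
assert not unknown, f"labels missing from label_vocab: {unknown}"

# The integer encoding that the label vocabulary implies.
label_ids = [label_vocab[label] for _, label in sentences]
print(label_ids)
```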

Check the dataset:

df = dataset.to_dataframe()

We can also check any of the parsed graphs:

from xpotato.models.utils import to_dot
from graphviz import Source
Source(to_dot(dataset.graphs[0]))
Image by author

Writing Rules with POTATO

If the dataset is prepared and the graphs are parsed, we can write rules to match labels. We can write rules either manually or extract them automatically (POTATO also provides a frontend that does both).

The simplest rule would be just a node in the graph (into in this case):

# The syntax of a rule is a list with three elements:
# 1. List of features that we want to match
# 2. List of features that shouldn't be in the matched graphs
# 3. Label of the rule
rule_to_match = [[["(u_1 / into)"], [], "Entity-Destination(e1,e2)"]]

Init the rule matcher:

from xpotato.graph_extractor.extract import FeatureEvaluator
evaluator = FeatureEvaluator()

Match the rules in the dataset:

# The function will return a dataframe with the matched instances:
evaluator.match_features(df, rule_to_match)

The function will return a dataframe with the matched examples. This rule will match any sentence that has the node into in it. In our case we would match examples numbered 0, 1, 2, 3, 4, 5, 6, 9, 14 from Table 1 (e.g. The scientists poured XXX into pint YYY.)

One of the core features of our tool is that we are also able to match subgraphs. To describe a graph, we use the PENMAN notation.

E.g. the string (u_1 / into :1 (u_3 / pour)) would describe a graph with two nodes (“into” and “pour”) and a single directed edge with the label “1” between them. Describing a subgraph with the string (u_1 / into :1 (u_2 / pour) :2 (u_3 / YYY)) returns only three examples, instead of the nine we matched when the feature was a single node.

# match a simple graph feature
evaluator.match_features(df, [[["(u_1 / into :1 (u_2 / pour) :2 (u_3 / YYY))"], [], "Entity-Destination(e1,e2)"]])

This feature will match on examples 0, 1, 9.
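To make the notation concrete, here is a minimal sketch that reads such a string into a node table and an edge list. It is a hand-written parser for only the small subset of PENMAN used in this article (one `(variable / label ...)` group per node, edges written as `:label`), not the parser POTATO itself uses. The names `tokenize`, `parse_node`, and `penman_to_graph` are illustrative.

```python
import re

def tokenize(text):
    # Parentheses and "/" are single tokens; everything else splits on whitespace.
    return re.findall(r"[()/]|[^\s()/]+", text)

def parse_node(tokens, pos, nodes, edges):
    """Parse one "(var / label :edge (child) ...)" group starting at `pos`.

    Fills `nodes` (variable -> label) and `edges` (source, edge label, target)
    and returns the node's variable plus the position after its ")".
    """
    if tokens[pos] != "(" or tokens[pos + 2] != "/":
        raise ValueError("expected '(var / label'")
    var, label = tokens[pos + 1], tokens[pos + 3]
    nodes[var] = label
    pos += 4
    while tokens[pos] != ")":
        edge_label = tokens[pos].lstrip(":")
        child, pos = parse_node(tokens, pos + 1, nodes, edges)
        edges.append((var, edge_label, child))
    return var, pos + 1

def penman_to_graph(text):
    nodes, edges = {}, []
    parse_node(tokenize(text), 0, nodes, edges)
    return nodes, edges

nodes, edges = penman_to_graph("(u_1 / into :1 (u_2 / pour) :2 (u_3 / YYY))")
print(nodes)
print(edges)
```

The variables (`u_1`, `u_2`, ...) are only identifiers for the nodes; the labels after the slash are what a pattern is compared against.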

We can also add negated features that we don’t want to match (this won’t match the first row where ‘pour’ is present):

# match a simple graph feature with a negated feature.
# The negated features go into the second parameter.
evaluator.match_features(df, [[["(u_1 / into :2 (u_3 / YYY))"], ["(u_2 / pour)"], "Entity-Destination(e1,e2)"]])

Matching examples 2, 3, 5, 6.
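The logic of a rule with negated features can be sketched in a few lines of plain Python. This is not POTATO's implementation: each parsed graph is reduced to a toy set of node labels, each feature is a bare node label rather than a PENMAN string, and the sketch assumes that all positive features must match while no negated feature may match. The fallback label "NONE" and the helper `apply_rules` are illustrative choices.

```python
# Toy stand-in for parsed graphs: just the set of node labels per sentence.
toy_graphs = [
    {"pour", "into", "XXX", "YYY"},
    {"push", "into", "XXX", "YYY"},
    {"progress", "from", "XXX", "YYY"},
]

# Same layout as the rules above: [positive features, negated features, label],
# except that each feature is a bare node label instead of a PENMAN string.
rules = [[["into"], ["pour"], "Entity-Destination(e1,e2)"]]

def apply_rules(nodes, rules, default="NONE"):
    """Return the label of the first rule whose positive features are all
    present and whose negated features are all absent."""
    for positive, negated, label in rules:
        has_all_positive = all(p in nodes for p in positive)
        has_any_negated = any(n in nodes for n in negated)
        if has_all_positive and not has_any_negated:
            return label
    return default

predictions = [apply_rules(g, rules) for g in toy_graphs]
print(predictions)
```

The first toy graph contains "into" but is rejected because of the negated "pour"; the third never matches because "into" is missing.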

If we don’t want to specify nodes, a regex can also be used in place of node and edge names:

# regex can be used to match any node (this will match instances
# where 'into' is connected to any node with a '1' edge)
evaluator.match_features(df, [[["(u_1 / into :1 (u_2 / .*) :2 (u_3 / YYY))"], [], "Entity-Destination(e1,e2)"]])
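The idea behind subgraph features with regex nodes can be illustrated with a small standalone sketch. The graphs below are hand-built toy graphs (not real parser output), stored as sets of `(source, edge label, target)` triples, and the hypothetical `matches` helper only handles a root node with direct children, which is enough for the patterns shown above. It is a simplification, not POTATO's matcher.

```python
import re

# Hand-built toy graphs (NOT real parser output): each graph is a set of
# (source_label, edge_label, target_label) triples.
graphs = {
    "poured": {("into", "1", "pour"), ("into", "2", "YYY"), ("pour", "2", "XXX")},
    "pushed": {("into", "1", "push"), ("into", "2", "YYY"), ("push", "2", "XXX")},
    "progress": {("from", "1", "XXX"), ("from", "2", "YYY")},
}

def matches(graph, root, children):
    """Return True if some node matching `root` has, for every
    (edge_label, child_regex) pair, an outgoing edge that fits."""
    sources = {src for src, _, _ in graph if re.fullmatch(root, src)}
    return any(
        all(
            any(s == src and e == edge and re.fullmatch(child, tgt)
                for s, e, tgt in graph)
            for edge, child in children
        )
        for src in sources
    )

# (u_1 / into :1 (u_2 / pour) :2 (u_3 / YYY)) as a root plus two children.
pour_pattern = ("into", [("1", "pour"), ("2", "YYY")])
# The same shape with a regex in place of the verb node.
any_pattern = ("into", [("1", ".*"), ("2", "YYY")])

pour_hits = sorted(name for name, g in graphs.items() if matches(g, *pour_pattern))
any_hits = sorted(name for name, g in graphs.items() if matches(g, *any_pattern))
print(pour_hits, any_hits)
```

The fully specified pattern only finds the "poured" graph, while the regex version also accepts "pushed".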

We can also refine regex rules using the training data. This automatically replaces the regex ‘.*’ with nodes that have high precision.

evaluator.train_feature("Entity-Destination(e1,e2)", "(u_1 / into :1 (u_2 / .*) :2 (u_3 / YYY))", df)

This returns (u_1 / into :1 (u_2 / push|pour) :2 (u_3 / YYY)), with ‘.*’ replaced by push and pour.
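A rough sketch of what such a refinement step could look like, using made-up training data: for every node that filled the ‘.*’ slot, compute its precision for the target label and keep only the fillers above a threshold. The `refine` helper, the data, and the 0.9 threshold are assumptions for illustration, not POTATO's actual code or defaults.

```python
from collections import Counter

# Toy training data (hypothetical): the node that filled the ".*" slot in each
# matched graph, paired with that example's gold label.
slot_fillers = [
    ("pour", "Entity-Destination(e1,e2)"),
    ("pour", "Entity-Destination(e1,e2)"),
    ("push", "Entity-Destination(e1,e2)"),
    ("turn", "Other"),
    ("turn", "Entity-Destination(e1,e2)"),
    ("turn", "Other"),
]

def refine(slot_fillers, target_label, min_precision=0.9):
    """Return a regex disjunction of fillers whose precision for
    `target_label` is at least `min_precision`."""
    total = Counter(node for node, _ in slot_fillers)
    correct = Counter(node for node, label in slot_fillers if label == target_label)
    keep = sorted(n for n in total if correct[n] / total[n] >= min_precision)
    return "|".join(keep)

disjunction = refine(slot_fillers, "Entity-Destination(e1,e2)")
refined = f"(u_1 / into :1 (u_2 / {disjunction}) :2 (u_3 / YYY))"
print(refined)
```

Here "turn" is dropped because only one of its three occurrences carries the target label; the order of the alternatives in the disjunction is alphabetical in this sketch.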

Human-in-the-loop rule learning

The idea of POTATO is:

  • Use subgraphs as features for training simple classifiers (LogReg, Random Forest, etc.)
  • Generate subgraphs only up to a certain edge number (to avoid large number of features)
  • Suggest rules to users based on feature importance
  • Through the UI, the user can accept, reject, edit, and combine patterns
  • Subgraphs may have regexes as node or edge labels
  • Underspecified subgraphs can be refined

To extract rules automatically from a labeled dataset, train a classifier on the dataset’s graph features and rank the features by relevance:

from sklearn.model_selection import train_test_split

train, val = train_test_split(df, test_size=0.2, random_state=1234)
trainer = GraphTrainer(train)
features = trainer.prepare_and_train(min_edge=1)

The features variable will contain the automatically extracted rules:

(u_15 / into :1 (u_26 / push))
(u_15 / into :1 (u_19 / pour :2 (u_0 / xxx)))
(u_15 / into :1 (u_19 / pour))
(u_19 / pour :2 (u_0 / xxx))
(u_15 / into :2 (u_3 / yyy))
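To give an intuition for how candidate subgraphs can be ordered, the sketch below ranks a few features by precision and then recall on toy data. Note the swap: POTATO ranks features using the importance scores of a trained classifier, while this example uses plain precision and recall so that it runs with the standard library only. The match vectors and the `(u_7 / from)` feature are made up.

```python
# Toy match table (hypothetical): which candidate subgraph matched which
# training example, plus the gold labels (1 = Entity-Destination, 0 = Other).
feature_hits = {
    "(u_15 / into :1 (u_26 / push))": [1, 0, 0, 0, 0, 0],
    "(u_15 / into :2 (u_3 / yyy))": [1, 1, 1, 0, 1, 0],
    "(u_7 / from)": [0, 0, 0, 1, 0, 1],
}
gold = [1, 1, 1, 0, 0, 0]

def precision_recall(hits, gold):
    """Precision and recall of one binary feature against the gold labels."""
    true_positives = sum(h and g for h, g in zip(hits, gold))
    predicted = sum(hits)
    positives = sum(gold)
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / positives if positives else 0.0
    return precision, recall

# Sort by (precision, recall), best first.
ranked = sorted(
    feature_hits,
    key=lambda feature: precision_recall(feature_hits[feature], gold),
    reverse=True,
)
for feature in ranked:
    print(feature, precision_recall(feature_hits[feature], gold))
```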

The UI

Besides the backend described in the previous sections, POTATO also comes with a HITL user interface that allows a user to extract rules from a dataset. To launch the HITL user interface, we need to load a dataset as a set of labeled or unlabeled graphs. Besides our predefined formats (ud, 4lang, amr), any directed graph can be loaded. Suggesting and evaluating rules requires ground truth labels (using the feature learning method described in the previous section). If these are not available, the UI can be launched in advanced mode for bootstrapping and annotating labels using rules. Once a dataset is loaded, the HITL frontend can be started, and the user is presented with the interface shown in Figure 3, built using the streamlit library.

The frontend shown in Figure 3 provides:

  • 1: The dataset browser allows the user to view the text, graph, and label for all rows of the dataset. The viewer renders graphs using the graphviz library and also provides the PENMAN notation, which the user can copy for quick editing of rules.
  • 2: Users can choose the class to work on (handy when dealing with multi-label classification).
  • 3: The rules constructed for each class are maintained in a list; they can be modified or deleted, and new features can be added.
  • 4: Rules can be viewed and evaluated on the training and validation datasets.
  • 5: The predictions of each rule can be analyzed by viewing true positive, false positive, or false negative examples.
  • 6: The suggest new rules button returns a list of suggested graphs together with their performance on the training data, allowing the user to select those that should be added to the rule list; this interface is shown in Figure 4. For rules containing regular expressions, the Refine button will replace regular expressions with a disjunction of high-precision labels. This function is implemented using the method described in the previous section.

Figure 3: The main page of POTATO allows the user to (1) browse the dataset and view the processed graphs, (2) choose the class to build a rule-based system for, (3) modify, delete, and add rules and get suggestions, (4) view the results of the selected rules, and (5) view example predictions for each rule (Image by author)
Figure 4: Patterns suggested by POTATO, ranked by precision (Image by author)

As described, the frontend is a streamlit app that can be started with the training and validation datasets. First, save them with the following code:

train.to_pickle("train_dataset.pickle")
val.to_pickle("val_dataset.pickle")

Then the streamlit app can be started with a single command:

streamlit run frontend/app.py -- -t train_dataset.pickle -v val_dataset.pickle

Rules for each class are automatically saved to disk in JSON format. This file can be loaded for further editing or for inference:

streamlit run frontend/app.py -- -t notebooks/train_dataset -v notebooks/val_dataset -hr features.json

Advanced mode

If labels are missing or only partially provided, the frontend can also be started in advanced mode, where the user annotates a few examples at the start and the system then gradually offers rules based on the provided examples.

Then, the frontend can be started:

streamlit run frontend/app.py -- -t unsupervised_dataset -m advanced

Evaluate

If you have the features ready and you want to evaluate them on a test set, you can run:

python scripts/evaluate.py -d test_dataset.pickle -f features.json

The result will be a csv file with the labels and the matched rules.

Service

If you are ready with the extracted features and want to use our package in production for inference (generating predictions for sentences), we also provide a REST API built on POTATO (based on fastapi).

First install FastAPI and Uvicorn:

pip install fastapi
pip install "uvicorn[standard]"

To start the service, you should set the language, the graph type, and the features for the service. This can be done through environment variables.

Example:

export FEATURE_PATH=/home/adaamko/projects/POTATO/features/semeval/test_features.json
export GRAPH_FORMAT=ud
export LANG=en
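As an illustration of how these three variables could be read and checked in Python, here is a small sketch. The `load_settings` helper is hypothetical and is not part of the POTATO service; the allowed values are the graph formats and languages listed earlier in this article.

```python
import os

SUPPORTED_GRAPH_FORMATS = {"ud", "fourlang", "amr"}
SUPPORTED_LANGUAGES = {"en", "de"}

def load_settings(env=os.environ):
    """Read and validate the three variables; raise early on a bad value."""
    settings = {
        "feature_path": env.get("FEATURE_PATH"),
        "graph_format": env.get("GRAPH_FORMAT"),
        "lang": env.get("LANG"),
    }
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError(f"missing settings: {missing}")
    if settings["graph_format"] not in SUPPORTED_GRAPH_FORMATS:
        raise ValueError(f"unsupported graph format: {settings['graph_format']}")
    if settings["lang"] not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {settings['lang']}")
    return settings

# A plain dict stands in for the real environment in this example.
example = load_settings({
    "FEATURE_PATH": "features/semeval/test_features.json",
    "GRAPH_FORMAT": "ud",
    "LANG": "en",
})
print(example)
```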

Then, start the REST API:

python services/main.py

It will start a service running on localhost on port 8000 (it will also initialize the correct models).

Then you can use any client to make post requests:

curl -X POST localhost:8000 -H 'Content-Type: application/json' -d '{"text":"The suspect pushed the XXX into a deep YYY.\nSparky Anderson is making progress in his XXX from YYY and could return to managing the Detroit Tigers within a week."}'

The answer will be a list with the predicted labels (if none of the rules match, it will return “NONE”):

["Entity-Destination(e1,e2)","NONE"]
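The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above: sentences are joined with newlines and posted as JSON to localhost on port 8000. The helper names `build_request` and `predict` are illustrative, and `predict` is only defined here, not called, since it needs the service to be running.

```python
import json
import urllib.request

def build_request(sentences, url="http://localhost:8000"):
    """Build the POST request; sentences are joined with newlines,
    mirroring the curl payload above."""
    payload = json.dumps({"text": "\n".join(sentences)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict(sentences, url="http://localhost:8000"):
    """Send the request and decode the JSON list of predicted labels.
    Requires the POTATO service to be running."""
    with urllib.request.urlopen(build_request(sentences, url)) as response:
        return json.loads(response.read().decode("utf-8"))

request = build_request([
    "The suspect pushed the XXX into a deep YYY.",
    "The scientists poured XXX into pint YYY.",
])
print(request.get_method(), request.full_url)
```

With the service up, `predict([...])` should return one label per input sentence, in the same order.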

The streamlit frontend also has an inference mode, where the implemented rule-system can be used for inference. It can be started with:

streamlit run frontend/app.py -- -hr features/semeval/test_features.json -m inference

Conclusion

POTATO enables the fast construction of rule-based systems and provides a transparent, explainable, and auditable alternative solution to deep learning models on NLP tasks. If you want to read more on the framework or want to try it out you can check the following sources:

References

[1] Doshi-Velez et al., Towards A Rigorous Science of Interpretable Machine Learning (2017)

[2] Devlin et al., BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2019)

[3] Serrano et al., Is Attention Interpretable? (2019)

[4] Bender et al., On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (2021)

[5] Banarescu et al., Abstract Meaning Representation for Sembanking (2013)

[6] Qi et al., Stanza: A Python Natural Language Processing Toolkit for Many Human Languages (2020)

[7] Kornai et al., Semantics (2019)

[8] Hendrickx et al., SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations between Pairs of Nominals (2010)

NLP engineer and PhD student at TU Wien. Interested in semantic parsing and explainable and interpretable solutions in NLP.