Are Expert Systems Dead?

A review of recent trends, use cases and technologies

Professor Simon J. Preis, Ph.D.
Towards Data Science

Photo by Nghia Le on Unsplash

Introduction

Everyone is talking about machine learning. And this is not just a feeling: if you search Google for “machine learning”, you will receive almost 700 million results. But what about expert systems? Traditionally, they are the other side of the AI coin. Google still returns around 7 million results for them, but there is a clear gap between the two concepts. This gap also shows up in the Google search trends: while interest in machine learning (orange line) has increased significantly in recent years, one can get the impression that expert systems (blue line) have somehow fallen into oblivion.

Google Search Trends in % (Author, Data from Google)

So is this just a result of AI evolution (survival of the fittest and hopefully not the “overfittest”; sorry for that AI joke ;-))? Are expert systems really outdated? Or is there still room for expert systems in one research area or another? I thought it would be good to collect and share some findings and experiences in order to take a stand for expert systems!

Contents:

  1. What are expert systems?
  2. Are expert systems relevant in topical research?
  3. What are use cases for expert systems?
  4. What are modern technologies for developing expert systems?

What are expert systems?

An expert system (ES) is a piece of software that consists of at least a knowledge base (typically a database), a domain-specific set of rules and an inference engine that is able to infer new facts from them. It is called an ES because it acts like a human expert in a particular domain, i.e. someone who is able to answer tricky questions related to their field of expertise. From an application point of view, an ES has two main types of users: a) the human experts who continuously review and modify the knowledge base and b) the end users who seek answers to domain-related questions. Compared to the so-called “black-box AI” approaches from deep learning, the result of an ES is transparent: the inference engine typically provides step-wise explanations for its results, so users are able to follow their logical construction.
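To make the three components more tangible, here is a minimal sketch in Python. It is not taken from any particular ES product: the rule names, the facts and the toy car-diagnosis domain are invented for illustration, and the engine only handles simple rules without variables. It repeatedly applies rules to the known facts and records a step-wise explanation for every fact it derives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    premises: frozenset
    conclusion: str


def infer(facts, rules):
    """Apply rules until no rule adds a new fact; return all facts and the explanations."""
    known = set(facts)
    explanations = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.premises <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                reason = " and ".join(sorted(rule.premises))
                explanations.append(f"{rule.conclusion} because {reason} (rule {rule.name})")
                changed = True
    return known, explanations


# Hypothetical knowledge base for a toy car-diagnosis domain (invented for this sketch).
RULES = [
    Rule("R1", frozenset({"engine_does_not_start", "lights_are_dim"}), "battery_is_flat"),
    Rule("R2", frozenset({"battery_is_flat"}), "recommend_charging_battery"),
]
FACTS = {"engine_does_not_start", "lights_are_dim"}

derived, trace = infer(FACTS, RULES)
for step in trace:
    print(step)
```

The experts would maintain `RULES`, the end users would supply `FACTS`, and the printed trace plays the role of the explanation component.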

Are expert systems relevant in topical research?

To answer the question of whether ES are really dead, it is worth looking at the number of publications that deal with ES or at least reference the term. For that purpose, I did a simple search on Google Scholar and checked the number of publications from 2005 to 2022 that contain the term ES. Let’s formulate the following working hypothesis: the publication trend for ES over the selected years is negative (which would confirm that “ES are dead”).

Now let’s check what the data shows. I have created a diagram that shows the absolute number of publications per time frame together with a trend line. The diagram clearly shows that the publication trend is still positive. Though the literature search is very high-level and noisy, my expectation is that ES would not be named in current articles if the term were outdated; so we can reject the hypothesis stated above. We can conclude that ES are still alive and continue to play an active role in research.

Research Trend for Expert Systems (Author)
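For readers who want to reproduce this kind of check, the following sketch fits an ordinary least-squares line to yearly counts and looks at the sign of the slope. The numbers in `years` and `counts` are placeholders invented for this sketch, not the Google Scholar counts behind the figure, and the function name `linear_trend` is my own.

```python
def linear_trend(years, counts):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Placeholder values for illustration only (NOT the real publication counts).
years = [2005, 2010, 2015, 2020]
counts = [100, 120, 150, 160]

slope, intercept = linear_trend(years, counts)
# The working hypothesis "the trend is negative" is rejected if the slope is positive.
hypothesis_rejected = slope > 0
print(f"slope per year: {slope:.2f}, hypothesis rejected: {hypothesis_rejected}")
```

A positive slope only describes the direction of the fitted line; it says nothing about how noisy the underlying search results are.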

What are use cases for expert systems?

So which current research disciplines develop and propose ES? Which domains do they serve, which requirements do they fulfill, and what is the main value of ES in those projects? Of course, these are great questions for a systematic literature review, but for now I will only present three selected projects that demonstrate the breadth of ES.

Cyber Security

ES can be used for automated security assessment of IoT ecosystems. Rak et al. (2022) proposed an ES that produces a threat model and a list of attack plans for each identified threat for an IoT system. The results provided by the ES can be used by penetration testers to perform a systematic security test of the target IoT infrastructure.

Project Management

Bhattacharya et al. (2021) proposed an ES for decision making in complex regulatory and technology implementation projects. For instance, the ES validates whether the project start conditions are met and allows tailoring the project plan based on project type and complexity.

Clinical Decision Support

Chrimes (2023) proposed an ES to support clinical decisions for COVID-19. The ES interacts with users via a chatbot to determine the potential severity of a COVID-19 infection or the possible biological system responses and comorbidities that can contribute to the development of severe cases of COVID-19.

Modern Technologies for ES

Especially in pure research projects, we often see Protégé as the ES development tool, even in current projects. The tool combines storage of knowledge (entities and object properties), rules (SWRL) and facts (individuals), and provides an inference engine as a pre-installed plugin (HermiT). I still believe that Protégé is important in education and research, especially since it is free of charge and easy to get started with. But I would not recommend using it in industry projects due to its limitations regarding system integration and user interface design. In my opinion, it is a tool for universities, not for companies.

However, there are modern tools that share some of the features of Protégé and that are industry-proven. The most important to name are graph databases, which allow semantic storage and semantic analysis of information. So to a certain degree, we can represent ontologies as knowledge graphs. For that purpose, neo4j provides a plugin called “neosemantics” for RDF, which is the standard data model for ontologies, including in Protégé. So we can, for instance, create ontologies in Protégé and export them to neo4j for further software integration. The neosemantics plugin also provides an inference engine; however, we can also do the reasoning “by hand” via Cypher, which is the standard query language for neo4j.

An example

But what is inferencing or reasoning? Let’s have a look at the following database that consists of persons and genders. We see a number of relationships that indicate the gender of a person (in a simplified binary world) with “IS_A”, and also family relationships between person nodes with “IS_Parent”.

Initial Database (Author)

Now we can distinguish between explicitly stored knowledge and transitive knowledge. The explicit knowledge is what we see in the graph and what I have described in the paragraph above. However, from our everyday experience we know that there are more relationships between those persons that follow from the available information: siblings, grandparents, uncles etc. Instead of searching for and/or entering all these relationships manually for all facts in the database, we can use reasoning to create this knowledge automatically.

Grandparents

Let’s start with an easy one: grandparents are the parents of the parents of a person. So looking at our graph database, we are looking for two-level “IS_Parent” chains like PersonA → PersonB → PersonC. In this case, we would state that PersonC is a grandchild of PersonA and PersonA is a grandparent of PersonC. We can infer that information via the following generic statement:

MATCH (a:Person)-[:IS_Parent]->(b:Person)-[:IS_Parent]->(c:Person)
RETURN a.name AS grandparent, c.name AS grandchild

The result of this statement is shown in the following screenshot. We see that Max and Anna are both grandchildren of Betty. Why? Because Betty is a parent of Franz, who is a parent of Max and Anna.

Grandparent Result (Author)
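For readers without a neo4j instance at hand, the same inference can be sketched in plain Python. This is only an illustration of the logic, not how neo4j evaluates the query; the edge list contains just the three “IS_Parent” relationships named in the text, so the real example database may hold more.

```python
# Explicit IS_Parent facts mentioned in the text, as (parent, child) pairs.
IS_PARENT = [("Betty", "Franz"), ("Franz", "Max"), ("Franz", "Anna")]


def grandparents(is_parent):
    """Join the parent relation with itself: a -> b -> c yields (a, c)."""
    return sorted(
        (a, c)
        for a, b in is_parent
        for b2, c in is_parent
        if b == b2
    )


for grandparent, grandchild in grandparents(IS_PARENT):
    print(f"{grandparent} is a grandparent of {grandchild}")
```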

Siblings

Another type of transitive knowledge is the sibling relationship. We are searching for persons who share a parent, which means we are looking for common “IS_Parent” relationships PersonA ← PersonB and PersonC ← PersonB. We can query that information via the following statement:

MATCH (a:Person)<-[:IS_Parent]-(b:Person),
      (c:Person)<-[:IS_Parent]-(b:Person)
RETURN DISTINCT a.name AS sibling_a, c.name AS sibling_b

The result of this statement is shown in the following screenshot. We see that Max and Anna are siblings. Since the pattern matches from both perspectives, we get two result records. This is also technically valid, because neo4j does not store bidirectional or undirected relationships: every stored relationship has exactly one direction, even though a query can ignore it. If Anna is a sibling of Max and Max is a sibling of Anna, then we have two separate relationships.

Sibling Result (Author)
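The sibling logic can be sketched in plain Python as well. Again, this is an illustration under the assumption that the data contains only the two “IS_Parent” relationships from Franz named in the text. The set removes duplicates in the same way DISTINCT does, and each pair appears in both directions, just like the two result records of the query.

```python
from collections import defaultdict
from itertools import permutations

# (parent, child) pairs named in the text.
IS_PARENT = [("Franz", "Max"), ("Franz", "Anna")]


def siblings(is_parent):
    """Return all ordered pairs of distinct children that share at least one parent."""
    children = defaultdict(set)
    for parent, child in is_parent:
        children[parent].add(child)
    pairs = set()  # a set drops duplicates, e.g. when two children share both parents
    for kids in children.values():
        pairs.update(permutations(sorted(kids), 2))
    return sorted(pairs)


print(siblings(IS_PARENT))
```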

Persisting Transitive Knowledge

There are use cases for such on-demand queries, but there are other use cases where we wish to store the new transitive knowledge in the database as well. This is especially useful if multiple users work with the database: here it does not make sense to “hide” any transitive knowledge that is potentially business-relevant. In that case, we can use the MERGE clause to create new (unique) relationships between nodes, as shown in the following statement:

MATCH (a:Person)<-[:IS_Parent]-(b:Person),
      (c:Person)<-[:IS_Parent]-(b:Person)
MERGE (a)-[r:IS_SIBLING]->(c)

Note: besides new relationships, we could also infer and persist new nodes, labels and properties.

If we then query the full database, we see the additional “IS_SIBLING” relationships between Anna and Max:

Modified database (Author)
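The key property of MERGE is that running the statement a second time does not create duplicates. The following sketch mimics that behavior on an in-memory set of triples. The helper names `merge_relationship` and `persist_siblings` are my own, and the set is only a stand-in for the database, not a description of how neo4j works internally.

```python
def merge_relationship(graph, start, rel_type, end):
    """Add the triple only if it is absent (like MERGE); return True if it was created."""
    triple = (start, rel_type, end)
    if triple in graph:
        return False
    graph.add(triple)
    return True


def persist_siblings(graph):
    """Materialize IS_SIBLING for children sharing a parent; return the number created."""
    parents = [(s, e) for s, t, e in graph if t == "IS_Parent"]
    created = 0
    for parent_1, a in parents:
        for parent_2, c in parents:
            # a != c is explicit here; the Cypher pattern never reuses one relationship twice.
            if parent_1 == parent_2 and a != c:
                created += merge_relationship(graph, a, "IS_SIBLING", c)
    return created


# The three IS_Parent relationships named in the text.
graph = {
    ("Betty", "IS_Parent", "Franz"),
    ("Franz", "IS_Parent", "Max"),
    ("Franz", "IS_Parent", "Anna"),
}
first_run = persist_siblings(graph)
second_run = persist_siblings(graph)
print(first_run, second_run)
```

The second run finds the relationships already in place and adds nothing, which is what makes it safe to re-run such a statement whenever new persons are added.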

It should be highlighted that this type of manual reasoning does not cover forward/backward chaining, which are usually required features of inference engines, e.g. to support complex theorem proofs. So if you are interested in these features, you can start reading here (general introduction) and here (inferencing with neo4j).
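To give a rough idea of what backward chaining means, here is a deliberately simplified sketch. It starts from a goal and works backwards through the rules until it reaches stored facts. The rules are written out for concrete persons (real inference engines use variables and unification instead), and the “ancestor” rule is a hypothetical addition that is not part of the example database.

```python
FACTS = {"parent(Betty, Franz)", "parent(Franz, Max)"}

# Each rule is (premises, conclusion). The ancestor rule is invented for this sketch.
RULES = [
    ({"parent(Betty, Franz)", "parent(Franz, Max)"}, "grandparent(Betty, Max)"),
    ({"grandparent(Betty, Max)"}, "ancestor(Betty, Max)"),
]


def prove(goal, facts, rules, seen=frozenset()):
    """Return True if the goal is a fact or follows from rules whose premises can be proven."""
    if goal in facts:
        return True
    if goal in seen:  # guard against rules that refer back to the goal
        return False
    for premises, conclusion in rules:
        if conclusion == goal and all(
            prove(premise, facts, rules, seen | {goal}) for premise in premises
        ):
            return True
    return False


print(prove("ancestor(Betty, Max)", FACTS, RULES))
print(prove("grandparent(Betty, Anna)", FACTS, RULES))
```

Forward chaining goes the other way: it starts from the facts and keeps applying rules until nothing new can be derived, which is closer to what the MERGE statement above does in a single step.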

Self-Learning Expert Systems

Now the question is whether we as humans always need to know these rules upfront, or have to manually prepare the rules we want to query, in order to extract transitive knowledge. The short answer is: no! And this leads us to an interesting link between the two often separated worlds of data-based machine learning and rule-based expert systems: we can apply decision trees to build rules from empirical data that are still “white-box”, since they are transparent to the user. Decision trees are not tied to graph databases, and the scikit-learn library in Python requires transforming the data from a graph into a flat structure, e.g. a pandas dataframe. So let’s ignore the graph model for now and jump directly to the point where we have a flat dataframe with selected numeric features.

Let’s assume we are working for a car seller and we want to build an expert system that tells the salesperson if a certain visitor is likely to purchase a car or not. Instead of developing rules from our own experience, we can use the historical sales data to derive the rules from previous purchase decisions. And this is something we should note when we are talking about decision trees: we are dealing with probabilities. Even if the tree predicts that a person is likely to buy a car, it does not mean that this person will really buy the car. We see a possible result of a decision tree classification for our car seller in the following figure.

Decision Tree Classification for Car Seller (Author)
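How does a tree arrive at a rule like “Age <= 44.5”? The sketch below shows the basic building block under simplifying assumptions: it searches a single numeric feature for the threshold that separates the two classes best, measured by Gini impurity. This is a hand-written illustration, not the scikit-learn implementation, and the tiny data set is invented and unrelated to the sales data behind the figure.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split on one feature."""
    candidates = sorted(set(values))
    best = None
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2  # midpoint between two neighboring values
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Invented toy data: visitor age and whether they bought a car (1) or not (0).
ages = [25, 30, 35, 40, 50, 55, 60]
bought = [0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(ages, bought)
print(f"if (Age <= {threshold}) then class: No Purchase, else class: Purchase")
```

A full decision tree repeats this search on every feature and then again inside each resulting group, which is how nested conditions on age and salary come about.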

Besides the graphical representation, we can also extract the new decision rules in a logical “if-then” format as described in this article.

if (Age <= 44.5) and (Annual Salary <= 90750.0) and (Annual Salary <= 69750.0) then class: No Purchase (proba: 100.0%) | based on 258 samples
if (Age <= 44.5) and (Annual Salary <= 90750.0) and (Annual Salary > 69750.0) then class: No Purchase (proba: 89.02%) | based on 173 samples
if (Age > 44.5) and (Age > 47.5) and (Annual Salary > 41750.0) then class: Purchase (proba: 82.78%) | based on 151 samples
if (Age > 44.5) and (Age > 47.5) and (Annual Salary <= 41750.0) then class: Purchase (proba: 98.46%) | based on 65 samples
if (Age <= 44.5) and (Annual Salary > 90750.0) and (Annual Salary <= 119750.0) then class: Purchase (proba: 67.35%) | based on 49 samples
if (Age <= 44.5) and (Annual Salary > 90750.0) and (Annual Salary > 119750.0) then class: Purchase (proba: 97.67%) | based on 43 samples
if (Age > 44.5) and (Age <= 47.5) and (Annual Salary > 53250.0) then class: No Purchase (proba: 50.0%) | based on 34 samples
if (Age > 44.5) and (Age <= 47.5) and (Annual Salary <= 53250.0) then class: Purchase (proba: 85.19%) | based on 27 samples

So let’s assume we have validated this decision tree model and integrated it with a user interface. The car salesperson could then enter some personal visitor information to get a response that tells whether this visitor is likely to buy a car or not. Let’s look at the following examples:

a) A potential customer enters the shop with the following attributes: female, age=28 and annual salary=78,000. Is this person likely to buy a car? → Our decision tree says: no (classified with label “0”).

b) A potential customer enters the shop with the following attributes: male, age=28 and annual salary=118,000. Is this person likely to buy a car? → Our decision tree says: yes (classified with label “1”).
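To see how the two answers follow from the extracted rules, we can write the eight if-then rules from above as a small Python function. The function name `predict` and the nesting are my own; the thresholds and probabilities are copied from the rule list. Note that gender does not appear in any of the extracted rules, so it plays no role here.

```python
def predict(age, annual_salary):
    """Return (class, probability) according to the extracted if-then rules."""
    if age <= 44.5:
        if annual_salary <= 69750.0:
            return "No Purchase", 1.0
        if annual_salary <= 90750.0:
            return "No Purchase", 0.8902
        if annual_salary <= 119750.0:
            return "Purchase", 0.6735
        return "Purchase", 0.9767
    if age <= 47.5:
        if annual_salary <= 53250.0:
            return "Purchase", 0.8519
        return "No Purchase", 0.5
    if annual_salary <= 41750.0:
        return "Purchase", 0.9846
    return "Purchase", 0.8278


print(predict(28, 78_000))   # example a)
print(predict(28, 118_000))  # example b)
```

The returned probability also shows how confident each answer is: example b) lands in a leaf where only about two thirds of the historical visitors actually bought a car.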

We see that such a system would provide functionality similar to a traditional expert system for this car seller domain. The benefit is that this decision-tree-based expert system is able to learn from new data in order to improve the rules continuously. However, we need to keep in mind that this type of expert system works with probabilities; hence, the responses are typically not 100% accurate. With traditional expert systems, due to the application of logical inference, the responses are logically correct (at least if the basic axioms are correct). But we need to consider the high manual effort to build and maintain the knowledge base and rules, and we also need to consider that even experts can be wrong, which requires extra validation effort. So in the end it’s a trade-off between maintenance efficiency and prediction quality.

Conclusions

After defining what ES are, we have seen that ES are still used, developed or at least referenced in research publications, and that the publication trend is even positive. From that perspective, we can clearly answer the initial question: are ES dead? No! We have discussed a few current projects and their domains to see the breadth of ES. Finally, we have discussed some features of graph databases, represented by neo4j, for performing tasks that are typically associated with ES. These essential tasks are, on the one hand, the semantic storage of explicit knowledge and, on the other hand, the rule-based inference of new transitive knowledge. Of course, neo4j provides many more features in terms of data storage and analysis, but I think that the simple examples above are sufficient to understand the capabilities of neo4j as the knowledge base of an ES. In addition, we have briefly discussed how we can build rules from data via decision trees, which allows the development of self-learning expert systems.

So, when should we think about applying ES?

  1. in cases where we simply do not have data (yet) to employ machine learning models. Example: to develop and justify qualification plans for new product releases without reliable reference products.
  2. in cases where we are rule-driven and we know the rules already and the rules are static and not directed by data. Example: to infer business-defined categorical master data based on other detailed information, e.g. the particular structure and setup of a production machine.
  3. in cases where we have data and we want to use and understand decision rules and we are aware that rules may change over time. Instead of developing and maintaining rules manually, we can employ decision trees. Example: to understand and infer customer behaviors based on empirical data.

And to close this article with a fun fact: I asked ChatGPT, and it does not consider itself an expert system, because it is not trained for a certain domain in order to provide expert-level advice.

Sources

Bhattacharya, K., Gangopadhyay, S., DeBrule, C. (2021). Design of an Expert System for Decision Making in Complex Regulatory and Technology Implementation Projects. In: Chakrabarti, A., Poovaiah, R., Bokil, P., Kant, V. (eds) Design for Tomorrow — Volume 3. Smart Innovation, Systems and Technologies, vol 223. Springer, Singapore. https://doi.org/10.1007/978-981-16-0084-5_50

Chrimes, D. (2023). Using Decision Trees as an Expert System for Clinical Decision Support for COVID-19. Interact J Med Res 2023; 12:e42540. URL: https://www.i-jmr.org/2023/1/e42540. DOI: 10.2196/42540

Rak, M., Salzillo, G., Granata, D. (2022). ESSecA: An automated expert system for threat modelling and penetration testing for IoT ecosystems. Computers and Electrical Engineering, Volume 99, 107721, ISSN 0045-7906. https://doi.org/10.1016/j.compeleceng.2022.107721


Professor of Quantitative Business | Ph.D. in Computer Science | 12 years of industry experience in Data Management, Analytics and Digital Manufacturing