Photo by @allecgomes on Unsplash

A Short History of Natural-Language Understanding

A subtopic of natural-language processing in artificial intelligence

Alex Moltzau
Towards Data Science
8 min read · Aug 2, 2020


When exploring natural language processing I stumbled upon the term natural-language understanding. It made me think more about language not only in terms of processing, but of meaning.

What does a yellow flower mean?

Of course, it can simply signify a yellow flower, as an object.

It can be the favourite flower of the person you love, and remind you of that wedding day, a memory cherished or sad depending on the circumstances.

Maybe you always bring yellow flowers to your grandmother’s grave.

In Mexico a yellow flower, the marigold, can symbolise death. Día de Muertos celebrations in Mexico have been characterised by the vibrant yellow and orange hues of the marigold flower.

Photo by @albrb

A yellow flower could stand for enlightenment, which is why it is used extensively to represent Buddha and Lord Vishnu in Indian scriptures.

In France, yellow flowers portray jealousy.

Does it mean the same over time?

Yellow flowers in Victorian England were used to symbolise unrequited love. Victorians used flowers a lot like we use emoji.

Talking of emoji — this one looks innocent.

🍆

It is most often used online to represent a penis.

When next to the sweat droplets emoji, it means ejaculation.

We went from flowers to ejaculation a bit quickly there.

Meaning and context in language, whether spoken, written or symbolic, are not that easy to pin down.

What is natural-language understanding?

If meaning is so difficult, it seems challenging to turn it into an automatic process.

In this article I explore the term natural-language understanding, starting mostly from the Wikipedia article and elaborating with a few other sources I have found interesting.

“Natural-language understanding (NLU) or natural-language interpretation (NLI) is a subtopic of natural-language processing in artificial intelligence that deals with machine reading comprehension.”

In other words, it deals with something quite difficult and complex.

Within the field of artificial intelligence (AI), it can be considered an AI-hard problem. But what is understanding?

“Understanding is a psychological process related to an abstract or physical object, such as a person, situation, or message whereby one is able to think about it and use concepts to deal adequately with that object. Understanding is a relation between the knower and an object of understanding.”

Can understanding be automated?

Photo by @heathermount

Can this process be described sufficiently to replicate it, or can artificial language (programming/code) be used to better understand it?

If the answer is ‘yes’ to either of these questions, you may have some interest in natural-language understanding.

However, these questions may not be relevant and you may simply want to explore it from a natural-language perspective — humans talking, texting or communicating.

If you have ever learnt a language other than your own, you may to some degree understand that the same sentence can mean different things, and that its meaning shifts with the social context.

Language is social and dynamic; it keeps changing and is never entirely static.

This insight can be applied to automated reasoning, machine translation, question answering, news gathering, text categorisation, voice activation, archiving and large-scale content analysis.

The history of natural-language understanding

There are likely many ways to tell the history of natural-language understanding, yet if we go by what you can read on Wikipedia it starts with the program STUDENT.

1964, STUDENT was written by Daniel Bobrow for his PhD dissertation at MIT.

STUDENT used a rule-based system with logic inference.

The rules were pre-programmed by the software developer, and made it possible to parse natural language.

It is known as one of the earliest attempts at natural-language understanding by a computer.

Prior to this John McCarthy had coined the term artificial intelligence in 1955.

Daniel Bobrow’s dissertation was titled Natural Language Input for a Computer Problem Solving System.

In fact, this publication can be found on ResearchGate.

Screenshot of Daniel Bobrow’s dissertation titled Natural Language Input for a Computer Problem Solving System

It showed how a computer could understand simple natural language input to solve algebra word problems.
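To get a feel for the general idea, here is a minimal sketch in Python of that kind of template matching: a fixed English pattern is recognised and translated into an equation that can be solved directly. The pattern, the normalisation and the solving step are my own illustrative assumptions; Bobrow's actual system was written in Lisp and handled far more varied phrasings.

```python
import re

# A minimal sketch of the STUDENT idea (not Bobrow's actual code):
# match a fixed English template against a word problem and translate
# it into an equation that can be solved directly.
PATTERN = re.compile(
    r"the sum of (\w+) and (\w+) is (\d+) , and (\w+) is (\d+) more than (\w+)"
)

def solve(problem: str):
    """Translate one template of algebra word problem into a solution."""
    # Normalise: lowercase, pad commas, drop the full stop.
    tokens = " ".join(problem.lower().replace(",", " ,").replace(".", "").split())
    m = PATTERN.search(tokens)
    if m is None:
        return None  # the template does not cover this phrasing
    x, y, total, _, diff, _ = m.groups()
    # x + y = total and x = y + diff  =>  y = (total - diff) / 2
    y_val = (int(total) - int(diff)) / 2
    return {y: y_val, x: y_val + int(diff)}

print(solve("The sum of Tom and Ann is 96, and Tom is 16 more than Ann."))
# {'ann': 40.0, 'tom': 56.0}
```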

1965, one year later Joseph Weizenbaum at MIT wrote ELIZA.

ELIZA was an interactive program that carried on a dialogue in English on any topic, the most popular being psychotherapy.

A conversation with the ELIZA chatbot. Rights: Public domain.

The creator regarded the program as a method to show the superficiality of communication between man and machine.

Yet, to his surprise a number of individuals attributed human-like feelings to the computer program, including his secretary.

ELIZA worked by simple parsing and substitution of key words into canned phrases.
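Here is a minimal sketch in Python of that mechanism. The keyword patterns and canned templates are my own illustrative assumptions; the real DOCTOR script had far more rules and also reflected pronouns (turning ‘my’ into ‘your’).

```python
import re
import random

# A minimal sketch of ELIZA's mechanism: scan for a keyword pattern,
# then substitute the captured text into a canned response template.
RULES = [
    (re.compile(r"\bi am (.*)", re.IGNORECASE),
     ["Why do you say you are {0}?", "How long have you been {0}?"]),
    (re.compile(r"\bi feel (.*)", re.IGNORECASE),
     ["Why do you feel {0}?", "Do you often feel {0}?"]),
    (re.compile(r"\bmy (.*)", re.IGNORECASE),
     ["Tell me more about your {0}."]),
]

def respond(utterance: str) -> str:
    for pattern, templates in RULES:
        match = pattern.search(utterance)
        if match:
            fragment = match.group(1).rstrip(".!?")
            return random.choice(templates).format(fragment)
    return "Please, go on."  # default when no keyword matches

print(respond("I feel anxious today"))
# e.g. "Why do you feel anxious today?"
```

Even this tiny version hints at why people could read feeling into the program: it mirrors your own words back at you.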

ELIZA gained surprising popularity as a toy project.

Still, it can be seen as a forerunner of the commercial chatbot systems we have in 2020.

1969, Roger Schank at Stanford University introduced the conceptual dependency theory for natural-language understanding.

The model uses the following basic representational tokens:

  • real-world objects, each with some attributes
  • real-world actions, each with attributes
  • times
  • locations

A set of conceptual transitions then acts on this representation.

How do you make meaning independent of the words used in the input?
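One way to picture it: different surface verbs map onto the same primitive act, so the representation does not depend on the exact words in the input. Here is a sketch in Python. ATRANS (abstract transfer of possession) is one of Schank's actual primitives, but the tiny lexicon and structure are my own illustrative assumptions.

```python
from dataclasses import dataclass

# A sketch of Schank-style conceptual dependency: different surface
# verbs map onto the same primitive act, making the representation
# independent of the words used in the input.
@dataclass
class Conceptualization:
    primitive: str   # e.g. ATRANS (possession), PTRANS (location), MTRANS (information)
    actor: str
    obj: str
    source: str
    recipient: str

# Illustrative toy lexicon: three verbs, one primitive.
VERB_TO_PRIMITIVE = {"give": "ATRANS", "hand": "ATRANS", "donate": "ATRANS"}

def parse(actor, verb, obj, recipient):
    return Conceptualization(
        primitive=VERB_TO_PRIMITIVE[verb],
        actor=actor, obj=obj, source=actor, recipient=recipient,
    )

# "Mary gave John a book" and "Mary handed John a book"
# produce the same underlying structure:
print(parse("Mary", "give", "book", "John") == parse("Mary", "hand", "book", "John"))
# True
```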

1970, William A. Woods introduced the augmented transition network (ATN) to represent natural language input.

These networks make use of sets of ‘finite state automata’.

The behavior of state machines can be observed in many devices in modern society that perform a predetermined sequence of actions depending on a sequence of events with which they are presented.

This is in contrast to phrase structure rules. Phrase structure rules are used to break a natural language sentence down into its constituent parts, also known as ‘syntactic categories’, including both lexical categories (parts of speech) and phrasal categories.

The finite state automata were called recursively.

“A finite automaton can be seen as a program with only a finite amount of memory. A recursive automaton is like a program which can use recursion (calling procedures recursively), but again over a finite amount of memory in its variable space.” [Recursive automata]
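To make that concrete, here is a toy sketch in Python where one network (for a sentence) calls another network (for a noun phrase) like a subroutine. The grammar and lexicon are my own illustrative assumptions, not Woods's actual ATN formalism.

```python
# A sketch of the ATN idea: each network is a small procedure over
# states, and a network can call another network (or itself, for
# nested structures) recursively. Toy grammar: S -> NP VERB NP.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "saw": "VERB", "chased": "VERB",
}

def noun_phrase(words, i):
    """NP network: DET NOUN. Returns next position, or None on failure."""
    if i + 1 < len(words) and LEXICON.get(words[i]) == "DET" \
            and LEXICON.get(words[i + 1]) == "NOUN":
        return i + 2
    return None

def sentence(words, i=0):
    """S network: NP VERB NP, calling the NP network like a subroutine."""
    j = noun_phrase(words, i)          # call into another network
    if j is None or j >= len(words) or LEXICON.get(words[j]) != "VERB":
        return False
    k = noun_phrase(words, j + 1)      # second call into the NP network
    return k == len(words)

print(sentence("the dog chased a cat".split()))   # True
print(sentence("dog the chased cat a".split()))   # False
```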

1971, Terry Winograd finished writing SHRDLU for his PhD thesis at MIT.

This program could understand simple English sentences in a restricted world of children’s blocks to direct a robotic arm to move items.

Original screen display posted by Stanford HCI.

This successful demonstration provided significant momentum for continued research in the field.

Winograd later published his book Language as a Cognitive Process.

“This book is probably the first ever comprehensive, authoritative, and principled description of the intellectual history of natural language processing with the help of computers.” [book review]

What makes this story even more interesting is the person Winograd would later advise.

Larry Page, who co-founded Google, had Terry Winograd as his adviser.

In the 1970s and 1980s, the natural language processing group at SRI International continued research and development in the field.

“SRI International (SRI) is an American nonprofit scientific research institute and organization headquartered in Menlo Park, California. The trustees of Stanford University established SRI in 1946 as a center of innovation to support economic development in the region. The organization was founded as the Stanford Research Institute. SRI formally separated from Stanford University in 1970 and became known as SRI International in 1977.”

1982, Gary Hendrix formed Symantec Corporation, originally as a company for developing a natural language interface for database queries on personal computers.

However, Symantec changed direction.

1983, Michael Dyer developed the BORIS system at Yale, which bore similarities to the work of Roger Schank and W. G. Lehnert.

In the decades between the 1980s and the 2010s the story gets a bit more vague, and it is a period I still need to learn more about, at least when it comes to natural-language understanding (NLU). In NLP more generally there was certainly progress.

1980s, there was a revolution in NLP with the introduction of machine learning algorithms for language processing.

There had been a steady increase in computational power and a gradual lessening of the dominance of Chomskyan theories of linguistics (e.g. transformational grammar), whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-learning approach to language processing.

“Chomsky developed a formal theory of grammar in which transformations manipulated not only the surface strings but also the parse tree associated with them, making transformational grammar a system of tree automata.”

The earliest-used machine learning algorithms, such as decision trees, produced systems of hard if-then rules similar to existing hand-written rules.

Statistical models have received more focus recently; these make soft, probabilistic decisions based on attaching real-valued weights to the features making up the input data.
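The contrast can be made concrete with a small sketch in Python: a hard if-then rule gives a crisp yes/no, while a statistical rule attaches real-valued weights to features and outputs a probability. The task, features and weight values here are my own illustrative assumptions.

```python
import math

# Contrast sketch: a hard if-then rule versus a soft, weighted decision.
# Illustrative task: decide whether a token is a proper noun.

def hard_rule(token: str, sentence_initial: bool) -> bool:
    # Decision-tree style: a crisp yes/no with no notion of confidence.
    if token[0].isupper() and not sentence_initial:
        return True
    return False

def soft_rule(token: str, sentence_initial: bool) -> float:
    # Statistical style: real-valued weights on features (made-up
    # values), squashed into a probability with the logistic function.
    weights = {"capitalised": 2.0, "sentence_initial": -1.5, "bias": -0.5}
    score = weights["bias"]
    score += weights["capitalised"] * token[0].isupper()
    score += weights["sentence_initial"] * sentence_initial
    return 1.0 / (1.0 + math.exp(-score))  # probability in (0, 1)

print(hard_rule("Paris", sentence_initial=True))            # False
print(round(soft_rule("Paris", sentence_initial=True), 2))  # 0.5
```

Where the hard rule flatly says no, the soft rule hedges at fifty-fifty, which is exactly the kind of graded judgement that made statistical methods more robust on real language.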

On the ‘history of natural language processing’ page on Wikipedia there is a list of software through the years, stretching back to the 1954 Georgetown experiment. However, I have chosen to focus on the 1980s to the 2010s.

Not all of these systems are focused on Natural Language Understanding.

In more recent decades we have seen the rise of other systems, such as IBM Watson.

2011, the Watson computer system competed on Jeopardy! against champions Brad Rutter and Ken Jennings.

On the one hand this is interesting progress; on the other hand, it is debated how much “understanding” such systems demonstrate. According to John Searle, for example, Watson did not even understand the questions.

John Ball, cognitive scientist and inventor of Patom Theory, supports this assessment.

As can be noticed from voice assistants in recent times, such as Siri, Amazon (Alexa) and Google (Nest), these do not always understand you. It must be said that they are getting closer, and people are now having conversations with devices.

Hundreds of millions of people are talking to boxes or smartphones, and these devices can to some extent understand you.

It will certainly be interesting to see how we can understand or interpret language in the years ahead. It is 2020, and we are moving closer to understanding how humans communicate. We are also moving closer to understanding how to make devices communicate in a way that humans can comprehend, so that we can have a dialogue with them.

I hope you found this article enjoyable and that it made you more interested in natural-language understanding.

The moment we understand this world, it is bound to change somewhere. Although we may only ever have a limited understanding, it is certainly interesting to create a map and see where it leads.

It could probably lead to a few misunderstandings.

Can a machine be programmed to understand you?

Photo by @kristapsungurs

This is #500daysofAI and you are reading article 425. I am writing one new article about or related to artificial intelligence every day for 500 days.
