An Overview of Natural Language Processing Applications

Practical approaches to processing text with deep learning, explained

Jacob Solawetz
Towards Data Science

Natural language processing (NLP) is a branch of artificial intelligence that encompasses a wide range of software designed to reason about and act on text data. NLP technology is advancing rapidly, and it can be difficult to sort out which NLP techniques are making the most impact in industry.

In this post, we will introduce and discuss the natural language processing techniques that have become widely adopted in real-life applications.

Let’s dive in.

A Brief History of Natural Language Processing

The field of natural language processing traces its roots all the way back to Alan Turing and the Turing test: can a computer be taught to converse in natural language so effectively that it would be mistaken for a human being? As such, NLP (along with computer vision) is at the core of modern artificial intelligence research.

Rules Based NLP

NLP has undergone two distinct phases. The first is commonly referred to as “good old-fashioned AI”, in which algorithms were written to parse language using logical rules. This approach produced an array of NLP techniques in the 1980s, 1990s, and 2000s, including knowledge bases, rule-based inference engines, and syntax parsers.

Machine Learning

The second phase is more recent: the introduction of machine learning techniques in NLP has led to a Cambrian explosion of new approaches, in which a large training corpus is used to supervise a deep learning model, producing a flexible algorithm for language. This article focuses on machine learning approaches in NLP, including:

  • Text Classification
  • Slot Extraction
  • Dialogue Systems
  • Text Translation

It is possible to build custom NLP models that do not fall under these well-established techniques, but it is always useful to consider whether your task might fit into one of these categories, or an ensemble of them, as this will allow you to leverage a host of open source work for your project.

Text Classification

Many tasks in natural language processing boil down to classifying text. In text classification, the machine learning algorithm is given a set of categories and must choose the one into which a given piece of text should be placed.

Text Classification Example

An example text classification prediction

For example, if we were training a bot to triage user requests at a travel agency, we might have the bot place the phrase “Please book me a hotel stay for two nights in Taipei” into the category “New Booking”. The text classification algorithm is tasked with deciding that this classification is superior to the other possible categories.

Text Classification Training

To train a text classifier, you will need to assemble a dataset of similar examples to show your machine learning algorithm. This dataset will be passed through a training loop, informing the model iteratively of the decisions that it should be making.
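To make the training loop concrete, here is a minimal sketch in plain Python. It is not how the transformer-based libraries below work internally: it swaps in a simple multiclass perceptron over bag-of-words counts, and the tiny dataset, the “Cancellation” category, and all function names are assumptions made up for illustration. The shape of the loop is the point: predict, compare with the label, and adjust the model when it is wrong.

```python
from collections import Counter, defaultdict

# Hypothetical labeled examples for the travel agency bot. A real dataset
# would contain many more examples per category.
TRAIN = [
    ("please book me a hotel stay for two nights in taipei", "New Booking"),
    ("i would like to reserve a room in paris", "New Booking"),
    ("book a flight and hotel for next week", "New Booking"),
    ("please cancel my reservation", "Cancellation"),
    ("i need to cancel the hotel i booked", "Cancellation"),
    ("cancel my trip to tokyo", "Cancellation"),
]


def featurize(text):
    """Bag-of-words features: counts of lowercased, whitespace-split tokens."""
    return Counter(text.lower().split())


def score(weights, label, feats):
    """Sum of the label's weights for the tokens present in the text."""
    return sum(weights[label].get(tok, 0.0) * n for tok, n in feats.items())


def predict(weights, labels, text):
    """Return the highest-scoring label (ties go to the first label)."""
    feats = featurize(text)
    return max(labels, key=lambda label: score(weights, label, feats))


def train(examples, epochs=10):
    """Perceptron training loop: update weights only on wrong predictions."""
    labels = sorted({label for _, label in examples})
    weights = {label: defaultdict(float) for label in labels}
    for _ in range(epochs):
        for text, gold in examples:
            guess = predict(weights, labels, text)
            if guess != gold:
                for tok, n in featurize(text).items():
                    weights[gold][tok] += n   # pull toward the correct label
                    weights[guess][tok] -= n  # push away from the wrong one
    return weights, labels


weights, labels = train(TRAIN)
print(predict(weights, labels, "Please book me a hotel stay for two nights in Taipei"))
```

With this toy dataset the final line prints “New Booking”. A deep learning classifier replaces the word counts and the update rule with a neural network and gradient descent, but it is fed labeled examples in the same iterative way.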

Text Classification Repositories

Some of our favorite repositories for text classification include:

  • ktrain — a high level implementation of text transformers for classification.
  • fastai.text — fastai has nice implementations of various text classifiers.

Slot Extraction

Another large subset of natural language processing problems falls under slot extraction (also known as information extraction or slot filling). Slot extraction looks at a sentence and extracts relevant pieces of information, assigning each one to a slot category.

Slot Extraction Example

An example slot extraction prediction

For example, in our travel agent bot, we may need more information than text classification has provided. We can train a slot extraction model to pull out the details needed to complete the user’s request, rather than having the bot ask for each piece of additional information.

Slot Extraction Training

To train a slot extraction model, you will need to gather a training dataset, as with classification. In this dataset, the spans in the sentence that you want to be extracted should be highlighted and labeled with their specific slot labels.
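As a rough illustration of what such labeled data can look like, the sketch below converts span annotations into one tag per token using the BIO scheme (B = beginning of a slot, I = inside a slot, O = outside any slot). BIO is one common encoding and an assumption here, not necessarily the format expected by any particular repository; the slot names “duration” and “location” and the helper name are also hypothetical.

```python
def spans_to_bio(tokens, spans):
    """Convert labeled token spans into one BIO tag per token.

    `spans` is a list of (start, end, label) tuples that index into
    `tokens`, with `end` exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"       # first token of the slot
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"       # remaining tokens of the slot
    return tags


tokens = "Please book me a hotel stay for two nights in Taipei".split()
# Hypothetical annotations: "two nights" is a duration, "Taipei" a location.
spans = [(7, 9, "duration"), (10, 11, "location")]

for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

Each token ends up with exactly one tag, so the slot extraction model can be trained as a per-token classifier.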

Slot Extraction Repositories

Some of our favorite slot extraction repositories include:

  • JointBert — a really nice implementation of using BERT for slot extraction
  • LSOIE — an open information extraction repo for open slot extraction

Dialogue Systems

Dialogue systems lie at the intersection of classification and slot filling. In a dialogue system, user input is first grouped into “intents” by a classification model and then scanned by a slot filling model for information relevant to the user’s request. Dialogue systems can carry across multiple turns of a conversation with the same model.
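The flow described above can be sketched as follows. Everything here is a simplified assumption: `classify_intent` and `extract_slots` are keyword and regular expression stand-ins for trained models, and the required-slot schema is hypothetical. What the sketch shows is the control flow: classify the intent, fill slots, and keep state across turns until the request is complete.

```python
import re

# Hypothetical schema: which slots each intent needs before it can be fulfilled.
REQUIRED_SLOTS = {"New Booking": ["location", "nights"]}
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def classify_intent(text):
    """Stand-in for a trained intent classifier."""
    return "New Booking" if "book" in text.lower() else None


def extract_slots(text):
    """Stand-in for a trained slot filling model."""
    slots = {}
    nights = re.search(r"\b(one|two|three|four|five)\s+nights?\b", text.lower())
    if nights:
        slots["nights"] = NUMBER_WORDS[nights.group(1)]
    location = re.search(r"\bin\s+([A-Z][a-z]+)", text)
    if location:
        slots["location"] = location.group(1)
    return slots


def handle_turn(state, text):
    """Update the dialogue state with one user turn and return a reply."""
    intent = classify_intent(text)
    if intent is not None:
        state["intent"] = intent
    state["slots"].update(extract_slots(text))
    if state["intent"] is None:
        return "How can I help you?"
    missing = [s for s in REQUIRED_SLOTS[state["intent"]] if s not in state["slots"]]
    if missing:
        return f"Could you tell me the {missing[0]}?"
    slots = state["slots"]
    return f"Booking {slots['nights']} nights in {slots['location']}."


state = {"intent": None, "slots": {}}
print(handle_turn(state, "Please book me a hotel"))
print(handle_turn(state, "Two nights in Taipei"))
```

The first turn sets the intent but leaves the slots empty, so the bot asks for the location. The second turn carries no intent of its own, yet the stored state lets the bot complete the booking.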

Dialogue System Resources

You can work on building your own dialogue system with open source technology like JointBert, or by using a chatbot service such as Rasa or Google’s Dialogflow.

Text Translation

Text translation is another area where NLP technologies have made a significant impact. Text translation takes text input in one language and sequentially predicts the translated sequence in the target language.

An example of text translation

In our hotel booking example, we may want to translate text into a given target language, say Spanish.
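To illustrate what predicting the translation sequentially means, here is a toy sketch of greedy decoding, one simple strategy in which the most likely next token is chosen at every step until an end marker appears. The lookup table stands in for a trained translation model and its scores are invented for illustration; a real model would compute them with a neural network.

```python
END = "</s>"  # end-of-sequence marker

# Toy stand-in for a trained model: for a given source sentence, maps the
# target tokens generated so far to scores for the next token. All values
# are made up for illustration.
TOY_MODEL = {
    "book a hotel": {
        (): {"reserva": 0.9, "un": 0.1},
        ("reserva",): {"un": 0.8, "hotel": 0.2},
        ("reserva", "un"): {"hotel": 0.95, END: 0.05},
        ("reserva", "un", "hotel"): {END: 1.0},
    }
}


def next_token_scores(source, prefix):
    """Score candidate next tokens given the source and the output so far."""
    return TOY_MODEL[source][tuple(prefix)]


def greedy_translate(source, max_len=10):
    """Build the translation one token at a time, always taking the best score."""
    output = []
    for _ in range(max_len):
        scores = next_token_scores(source, output)
        token = max(scores, key=scores.get)
        if token == END:
            break
        output.append(token)
    return " ".join(output)


print(greedy_translate("book a hotel"))
```

This prints “reserva un hotel”. Each prediction depends on the tokens already produced, which is why the output is generated one step at a time rather than all at once.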

Unlike text classification and slot filling, text translation offers less room for customization, and you will generally use an off-the-shelf model that has already been trained to translate between the languages of your choosing.

Conclusion

It is an exciting time to be working on projects involving natural language processing, as NLP technologies powered by machine learning are making rapid advances and having real impact in industry.

In this post, we have reviewed some of the key techniques that are commonly utilized in natural language processing including text classification, slot extraction, dialogue systems, and text translation.

As always, happy training!

Originally published at https://blog.apexadvisors.ai on January 6, 2021.

Machine Learning @ Roboflow — building tools and artifacts like this one to help practitioners solve computer vision.