
Transformers: Implementing NLP Models in 3 Lines of Code

An introduction to the transformers library for implementing state-of-the-art models for different NLP tasks

Figure 1. Transformers | Image by author

Using state-of-the-art Natural Language Processing models has never been easier. Hugging Face [1] has developed a powerful library called transformers which allows us to implement and make use of a wide variety of state-of-the-art NLP models in a very simple way. In this blog, we are going to see how to install and use the transformers library for different tasks such as:

  • Text Classification
  • Question-Answering
  • Masked Language Modeling
  • Text Generation
  • Named Entity Recognition
  • Text Summarization
  • Translation

So before we start reviewing each of the implementations for the different tasks, let’s install the transformers library. In my case, I am working on macOS; when trying to install directly with pip I got an error, which I solved by first installing the Rust compiler as follows:

$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

After that I installed transformers directly with pip as follows:

$ pip install transformers

Great, with these two steps the library should now be installed correctly. So let’s start with the different implementations, let’s go for it!

Text Classification

The text classification task consists of assigning a given text to a specific class from a given set of classes. Sentiment analysis is the most common example of a text classification problem.

To use a text classification model through the transformers library, we only need two arguments, task and model, which specify the type of problem to be addressed and the model to be used, respectively. Given the great diversity of models hosted in the Hugging Face repository, we can start playing with some of them. Here you can find the set of models for text classification tasks.

In Figure 2 we can see the implementation of the bert-base-multilingual-uncased-sentiment model for sentiment analysis.

Figure 2. Text Classification | Image by author
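
Since the figure is only an image, here is a minimal sketch of what that call might look like. The full repository path of the model and the input sentence are my own assumptions (they are not shown in the figure), and the label set you get back (NEG/NEU/POS, star ratings, etc.) depends on the checkpoint you pick:

from transformers import pipeline

# Only two arguments are needed: the task and the model.
# The "nlptown/" prefix and the example sentence are assumptions.
classifier = pipeline(
    task="sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment",
)

result = classifier("I did not like this movie at all.")
print(f"Result: {result}")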

The output is:

Result: [{'label': 'NEG', 'score': 0.9566874504089355}]

The result you obtain will depend on the model you choose to implement. It is important to read the documentation of each model to know what datasets it was trained on and what type of classification it performs. Another great advantage of transformers is that if you have your own model hosted in the Hugging Face repository, you can also use it through this library.

Question-Answering

The task of Extractive Question Answering consists of finding, within a given context, the answer to a given question. One of the most representative datasets for this task is the Stanford Question Answering Dataset (SQuAD) [2].

To tackle this task, the transformers pipeline requires a context and a question. In the following example the context is determined by a paragraph from the Alice in Wonderland book [3], and the question refers to an event described in that paragraph. In the following figure you can see how the implementation would look:

Figure 3. Question-Answering | Image by author
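
A minimal sketch of the question-answering call shown in the figure. The article’s exact checkpoint path is not visible, so here I substitute deepset/roberta-base-squad2 (a RoBERTa model fine-tuned on SQuAD that is available on the Hub); the context is the opening paragraph of Alice in Wonderland, and the question is my own guess:

from transformers import pipeline

# Substitute checkpoint: a RoBERTa model fine-tuned on SQuAD 2.0.
qa = pipeline(task="question-answering", model="deepset/roberta-base-squad2")

context = (
    "Alice was beginning to get very tired of sitting by her sister on the "
    "bank, and of having nothing to do: once or twice she had peeped into "
    "the book her sister was reading, but it had no pictures or "
    "conversations in it."
)
question = "Who was Alice sitting by on the bank?"

result = qa(question=question, context=context)
print(f"Answer: '{result['answer']}'")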

The output is:

Answer: 'her sister'

For this task, we select the model roberta-base-squad-v1; however, in the Hugging Face repository we can find several alternative models for this task, and it would be worth taking a look at some of them.

Masked Language Modeling

The Masked Language Modeling task consists of masking tokens of a given text sequence with a masking token, and the model is asked to fill each mask with an appropriate token.

For a task of this type, the transformers pipeline only requires the name of the task (in this case fill-mask) and then the text sequence in which the token to be masked is specified. In the following figure we can see the implementation:

Figure 4. Masked Language Modeling | Image by author
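
A sketch of the fill-mask call. No model is passed, so the pipeline falls back to its default checkpoint; the masked sentence is reconstructed from the output below, so treat it as an assumption:

from transformers import pipeline

# Only the task name is required; the library's default checkpoint is used.
unmasker = pipeline(task="fill-mask")

# Build the sentence with the mask token of whatever model was loaded.
sentence = f"{unmasker.tokenizer.mask_token} movies are often very scary to people"

for prediction in unmasker(sentence):
    print(prediction)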

The output is:

[{'sequence': ' Horror movies are often very scary to people',
  'score': 0.12314373254776001,
  'token': 28719,
  'token_str': ' Horror'},
 {'sequence': ' horror movies are often very scary to people',
  'score': 0.052469268441200256,
  'token': 8444,
  'token_str': ' horror'},
 {'sequence': 'Ghost movies are often very scary to people',
  'score': 0.05243474990129471,
  'token': 38856,
  'token_str': 'Ghost'},
 {'sequence': 'War movies are often very scary to people',
  'score': 0.03345327079296112,
  'token': 20096,
  'token_str': 'War'},
 {'sequence': 'Action movies are often very scary to people',
  'score': 0.029487883672118187,
  'token': 36082,
  'token_str': 'Action'}]

The result is displayed as a list of tokens and their respective properties. In this case, the token with the highest score is Horror and the lowest-scoring of the five is Action.

Text Generation

The text generation task refers to the creation of a syntactically and semantically correct portion of text that follows from a given context. In this case, the pipeline initialization requires the type of task and the model to be used, as in the previous tasks. Finally, the pipeline instance requires two arguments: the context (or seed) and the maximum length of the sequence to be generated, max_length. The number of sequences to generate is an optional parameter.

The following figure shows the implementation of the GPT-2 model for the generation of 5 text sequences:

Figure 5. Text Generation | Image by author
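
A sketch of the GPT-2 call. The seed text is inferred from the generated sequences below, and the max_length value is a guess:

from transformers import pipeline

generator = pipeline(task="text-generation", model="gpt2")

# Seed text plus generation options; num_return_sequences asks for 5 outputs.
sequences = generator(
    "My name is Fernando, I am from Mexico and",
    max_length=30,
    num_return_sequences=5,
)

for s in sequences:
    print(s["generated_text"])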

The output is:

[{'generated_text': 'My name is Fernando, I am from Mexico and live for a reason. I am a musician and the best producer, you might call me a poet'},
 {'generated_text': 'My name is Fernando, I am from Mexico and I make an app with a lot of friends to keep us safe!" said Fernando.\n\nThe'},
 {'generated_text': 'My name is Fernando, I am from Mexico and I am an atheist. I am living in a town called Tanta and I am living in the'},
 {'generated_text': 'My name is Fernando, I am from Mexico and I have been doing this gig since the age of 21 and I am the first person to record this'},
 {'generated_text': 'My name is Fernando, I am from Mexico and I am in Mexico", he said.\n\nHis name may be a reference to his birthplace and'}]

The sequences GPT-2 generates about someone named Fernando who lives in Mexico are a little funny.

Named Entity Recognition

The Named Entity Recognition task refers to the assignment of a class to each token of a given text sequence. For the implementation of this task, it is only necessary to pass the task identifier ner to the pipeline initialization. Subsequently, the object receives only one text sequence. In the following figure we can see the implementation:

Figure 6. Named Entity Recognition | Image by author
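
A sketch of the NER call. Only the task identifier is passed, so the default checkpoint is used; the input sentence is reconstructed from the entities printed below and is therefore an assumption:

from transformers import pipeline

ner = pipeline(task="ner")

text = (
    "My name is Fernando, I live in Mexico and I work as a "
    "Machine Learning Engineer at Hitch."
)

# Each prediction is a dict; print the token and its assigned class.
for entity in ner(text):
    print((entity["word"], entity["entity"]))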

The output is:

('Fernando', 'I-PER')
('Mexico', 'I-LOC')
('Learning', 'I-ORG')
('Engineer', 'I-MISC')
('Hit', 'I-ORG')
('##ch', 'I-ORG')

For this example, the classes are:

  • I-MISC, Miscellaneous entity
  • I-PER, Person’s name
  • I-ORG, Organisation
  • I-LOC, Location

It is interesting to see that the company name (split into the sub-tokens Hit and ##ch) was correctly assigned as an organization.

Text Summarization

The Text Summarization task refers to the extraction of a summary from a given text. To initialize the pipeline, only the task identifier summarization is required. Subsequently, for the implementation of the task, only the text and the maximum and minimum length of the sequence to be generated are required as arguments. In the following figure we can see the implementation of this task:

Figure 7. Summarization | Image by author
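
A sketch of the summarization call. The default checkpoint is used, and the input passage (a short snippet about machine learning and data science) and the length limits are assumptions; in practice you would pass the full text you want to summarize:

from transformers import pipeline

summarizer = pipeline(task="summarization")

# Placeholder passage; replace it with the full text to be summarized.
text = (
    "Machine learning is a branch of artificial intelligence (AI) and "
    "computer science which focuses on the use of data and algorithms to "
    "imitate the way that humans learn, gradually improving its accuracy. "
    "Machine learning is an important component of the growing field of "
    "data science. As big data continues to expand and grow, the market "
    "demand for data scientists will increase."
)

print(summarizer(text, max_length=80, min_length=20))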

The output is:

[{'summary_text': ' Machine learning is an important component of the growing field of data science . Machine learning, Deep Learning, and neural networks are all sub-fields of artificial intelligence . As big data continues to grow, the market demand for data scientists will increase, requiring them to assist in the identification of the most relevant business questions .'}]

As we can see, the summary generated by the model is faithful to the input text. As with the previous tasks, we can play with various models for text summarization such as BART, DistilBart, and Pegasus [4].

Translation

The translation task refers to the conversion of a text written in one language into another language. The transformers library allows us to use state-of-the-art translation models such as T5 in a very simple way. The pipeline is initialized with the identifier of the task to be solved, which specifies the source language and the target language; for example, to translate from English to French the identifier is translation_en_to_fr. Finally, the generated object receives the text to be translated as an argument. In the following figure we can see the implementation of an English-to-French translator:

Figure 8. Translation | Image by author
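
A sketch of the English-to-French translation call. The task identifier encodes both languages; the t5-base checkpoint and the input sentence (inferred from the French output below) are assumptions:

from transformers import pipeline

translator = pipeline(task="translation_en_to_fr", model="t5-base")

text = (
    "Machine learning is a branch of artificial intelligence (AI) and "
    "computer science which focuses on the use of data and algorithms to "
    "imitate the way that humans learn, gradually improving its accuracy."
)

print(translator(text)[0]["translation_text"])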

The output is:

L'apprentissage automatique est une branche de l'intelligence artificielle (AI) et de la science informatique qui se concentre sur l'utilisation de données et d'algorithmes pour imiter la façon dont les humains apprennent, en améliorant progressivement sa précision.

Conclusion

Throughout this tutorial blog we saw how to use the transformers library to implement state-of-the-art NLP models in a very simple way.

In this blog we saw how to implement some of the most common tasks. It is important to mention that the examples shown here are merely for inference; one of the great attributes of the transformers library is that it also provides the methods to fine-tune our own models based on those already pretrained, which would be a good topic for the next blog.

References

[1] Hugging Face

[2] SQuAD: The Stanford Question Answering Dataset

[3] Alice in Wonderland

[4] Text Summarization Models

