How to Use GPT-J for (Almost) Any NLP Task

Prompt engineering and how it can be used with text generation models

Heiko Hotz

Published in

Towards Data Science

5 min read · May 6, 2022

What is this about?

In a previous blog post we looked at how to set up our very own GPT-J playground using Streamlit, Hugging Face, and Amazon SageMaker. With this playground we can now start experimenting with the model and generate some text, which is a lot of fun. But eventually we want the model to actually perform NLP tasks such as translation, classification, and many more. In this blog post we will look at how we can achieve that using different parameters and particular prompts for the GPT-J model.

This blog post builds on that previous blog post and its GitHub repo, and it is assumed that you have already built your own GPT-J playground. The code for this blog post can be found in a separate branch of the same GitHub repo.

Adding parameter controls

Before we get started with prompt engineering we should add some parameters with which we can control the model’s behaviour to a certain extent. A great blog post that describes many of those parameters in detail can be found here. In this tutorial we will introduce just three of them: response length, temperature, and repetition penalty. Response length is pretty self-explanatory: it controls how long the model’s response will be. Temperature controls how random the model’s word choices are: the higher the temperature, the more ‘creative’ (and less predictable) the model’s response will be. Repetition penalty is again quite self-explanatory: the higher this value, the more the model is penalised for repeating words it has already produced.
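To make the last two parameters a little more concrete, here is a minimal sketch of how temperature and repetition penalty are commonly applied to a model’s raw scores (logits) before the next token is sampled. It is an illustration in plain Python, not GPT-J’s actual implementation; the function names and the example numbers are made up for this post.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [score / temperature for score in logits]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(score - highest) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Lower the scores of tokens that were already generated (penalty > 1.0)."""
    adjusted = list(logits)
    for token_id in set(seen_token_ids):
        if adjusted[token_id] > 0:
            adjusted[token_id] /= penalty
        else:
            adjusted[token_id] *= penalty
    return adjusted


# Hypothetical scores for a three-token vocabulary.
example_logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(example_logits, 0.2))  # sharply peaked
print(softmax_with_temperature(example_logits, 1.5))  # much flatter
print(apply_repetition_penalty(example_logits, [0], 1.5))
```

With a low temperature almost all probability sits on the highest-scoring token, which is why low values give more predictable output.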

Let’s create a sidebar in our Streamlit app that will allow us to control the values for these parameters in the UI:
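A minimal sketch of what such a sidebar could look like is shown below. The function name, slider labels, ranges, and defaults are assumptions for illustration rather than the exact code from the repo. The function takes the sidebar object as an argument so that it can also be tried out without Streamlit.

```python
def render_parameter_controls(sidebar):
    """Draw three sliders on the given sidebar and return their current values.

    `sidebar` is expected to offer a Streamlit-style `slider` method,
    for example `st.sidebar`. All ranges and defaults are illustrative.
    """
    response_length = sidebar.slider(
        "Response length", min_value=10, max_value=500, value=50, step=10
    )
    temperature = sidebar.slider(
        "Temperature", min_value=0.1, max_value=1.5, value=0.2, step=0.1
    )
    repetition_penalty = sidebar.slider(
        "Repetition penalty", min_value=1.0, max_value=2.0, value=1.0, step=0.1
    )
    return {
        "response_length": response_length,
        "temperature": temperature,
        "repetition_penalty": repetition_penalty,
    }
```

In the Streamlit app this could be called as `params = render_parameter_controls(st.sidebar)` after `import streamlit as st`.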

These controls are now visible in a sidebar of the UI.

We also need to make sure that we pass the values for these parameters to the model by incorporating them into the payload when calling the model endpoint:
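The sketch below shows one way such a payload could be built. It assumes the endpoint follows the Hugging Face inference convention of an `inputs` string plus a `parameters` dictionary, and that the generation arguments are named `max_length`, `temperature`, and `repetition_penalty`. The function name and the sample values are hypothetical.

```python
import json


def build_payload(prompt, response_length, temperature, repetition_penalty):
    """Bundle the prompt and the sidebar values into one request payload."""
    return {
        "inputs": prompt,
        "parameters": {
            # Assumption: depending on the setup, max_length may count
            # the prompt tokens as well as the generated ones.
            "max_length": response_length,
            "temperature": temperature,
            "repetition_penalty": repetition_penalty,
        },
    }


# Serialise the payload so it can be sent as the request body.
body = json.dumps(build_payload("Once upon a time", 50, 0.2, 1.0))
print(body)
```

The resulting JSON string would then be sent as the request body when invoking the model endpoint.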

Now that we have these parameters added to our application we can start experimenting with different prompts.

Prompt engineering

Let’s start by having a look at the Wikipedia definition of prompt engineering:

Prompt engineering is a concept in artificial intelligence, particularly natural language processing (NLP). In prompt engineering, the description of the task is embedded in the input, e.g., as a question instead of it being implicitly given.

This means we can create some text as input for the model that will ‘tell’ the model which task to perform. Let’s look at some examples.

Classification

Let’s start with a relatively “simple” task, text classification:

In this example we state the task explicitly (match food to countries) and also provide a few examples. This is called few-shot learning, because the model is able to learn what the task is from these examples (in addition to the task statement).
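A few-shot prompt of this kind can also be assembled in code. The sketch below is an illustration only: the helper name, the wording of the task statement, and the example pairs are made up, and only “Fish & chips” comes from the example in this post.

```python
def build_few_shot_prompt(task, examples, query):
    """Combine a task statement, solved examples, and one open query."""
    lines = [task, ""]
    lines += [f"{text}: {label}" for text, label in examples]
    lines.append(f"{query}:")  # left open for the model to complete
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Match each food to its country of origin.",  # hypothetical task statement
    [("Pizza", "Italy"), ("Sushi", "Japan")],      # hypothetical examples
    "Fish & chips",
)
print(prompt)
```

The prompt ends with the unanswered item, so the most natural continuation for the model is the missing label.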

We chose a low temperature and a low repetition penalty because we don’t want the model to be too creative for this task. We also kept the response length at medium, so the model keeps carrying on with the list after it has given its answer. Most of the time this will be interesting but undesirable because we want the model to classify one specific text (“Fish & chips”). This can be remedied with a bit of post-processing by cutting off everything after the first line break.
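The post-processing step could look roughly like the sketch below. It assumes that the endpoint may return the prompt followed by the completion, so the prompt is stripped first if it is present. The function name and the sample output are hypothetical.

```python
def extract_first_answer(generated_text, prompt):
    """Keep only the first line the model produced after the prompt."""
    completion = generated_text
    if generated_text.startswith(prompt):
        # Drop the echoed prompt so only the new text remains.
        completion = generated_text[len(prompt):]
    lines = completion.strip().splitlines()
    return lines[0].strip() if lines else ""


prompt = "Pizza: Italy\nFish & chips:"
# Hypothetical model output that carries on with the list.
generated = prompt + " England\nBurger: USA\nSushi: Japan"
print(extract_first_answer(generated, prompt))
```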

Translation

Text generation models also perform well on translation tasks, provided the model has been trained on multilingual data. Let’s omit the explicit task statement and just provide a few examples:

First off, for those not familiar with German, this is a correct translation (one of several possible correct translations, to be precise). So, the model is even able to identify the task without us explicitly stating it. Again we kept the values for temperature and repetition penalty rather low. Choosing a higher value for temperature, for example, makes the model a bit too creative. A value of 0.6 once created the following translation: “Um wie viel Uhr macht es Bier?” (“At what time does it make beer?”). Then again, this might be a common question in German ;)
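For illustration, a prompt that consists of examples only, with no task statement, could be built as in the sketch below. The helper name, the sentence pairs, and the sentence to translate are hypothetical and are not the ones used in the playground example.

```python
def build_translation_prompt(pairs, sentence):
    """Build an examples-only prompt; the task is implied by the pattern."""
    blocks = [f"English: {english}\nGerman: {german}" for english, german in pairs]
    blocks.append(f"English: {sentence}\nGerman:")  # left open for the model
    return "\n\n".join(blocks)


prompt = build_translation_prompt(
    [
        ("Good morning.", "Guten Morgen."),
        ("Where is the train station?", "Wo ist der Bahnhof?"),
    ],
    "What time does the shop open?",
)
print(prompt)
```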

SQL Generation

Let’s make things a bit more difficult and not provide any examples. Instead we just tell the model directly what to do:

This technique is called zero-shot learning, because we provide no examples and still expect the model to understand the task. In this case it actually worked out quite well, and, in general, text generation models perform surprisingly well without any examples to learn from.
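A zero-shot prompt is simply an instruction without solved examples. The sketch below is a hypothetical illustration: the helper name, the instruction text, and the table name are invented, and ending the prompt with `SELECT` is just one possible way to nudge the model towards answering with a query.

```python
def build_zero_shot_prompt(instruction, answer_prefix=""):
    """Return a prompt made of an instruction only, with no solved examples."""
    return f"{instruction.strip()}\n\n{answer_prefix}"


prompt = build_zero_shot_prompt(
    # Hypothetical instruction and table name.
    "Create a SQL query that returns the names of all customers "
    "older than 30 from the table 'customers'.",
    answer_prefix="SELECT",
)
print(prompt)
```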

We, again, chose low values for temperature and repetition penalty because SQL queries have a fixed structure and therefore we don’t want the model to be too creative.

Free text generation

You might have noticed that so far we didn’t use the parameter controls all that much: both temperature and repetition penalty have been kept on the low side. This is because we want the model to complete a specific task, so we don’t actually want it to be too creative. Let’s try a different use case where we want the model to be a bit more creative.

In this case we want to increase the temperature at least to medium as we want the model to be more creative. When doing that we also often find that the model uses the same words quite often, so we increase the repetition penalty as well. Depending on the kind of text we want the model to produce these values could be increased even further.
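One way to keep track of which settings suit which task is a small table of presets, sketched below. The task names and all numeric values are assumptions chosen to mirror the “low” and “medium” settings described in this post; they are starting points for experimentation, not recommended values.

```python
# Hypothetical presets: low values for structured tasks, higher for free text.
TASK_PRESETS = {
    "classification": {"temperature": 0.2, "repetition_penalty": 1.0},
    "translation": {"temperature": 0.2, "repetition_penalty": 1.0},
    "sql": {"temperature": 0.2, "repetition_penalty": 1.0},
    "free_text": {"temperature": 0.8, "repetition_penalty": 1.3},
}


def parameters_for(task, response_length):
    """Look up the preset for a task and add the desired response length."""
    preset = TASK_PRESETS[task]
    return {"max_length": response_length, **preset}


print(parameters_for("sql", 50))
print(parameters_for("free_text", 200))
```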

Conclusion

In this tutorial we started experimenting with various prompts to accomplish NLP tasks such as classification, translation, etc. We have, however, only scratched the surface so far: there are many more tasks we could experiment with, such as summarisation, generating email responses, etc. There are also many more parameters we could include in the app to control the model’s behaviour, such as Top-P, beam search, etc.

I find that it’s a lot of fun to experiment with these text generation models and I hope you can now do the same after following this blog post and setting up this GPT-J playground.