Developing a Strategy Bot for an NGO with GPT-3

How to create your own custom version of GPT-3 for even better results

Heiko Hotz
Towards Data Science


Photo by JESHOOTS.COM on Unsplash

What is this about?

When OpenAI released GPT-3 in 2020, the Natural Language Processing (NLP) community went wild (similar to the hype created in the past few months by text-to-image models like DALL-E 2 and Stable Diffusion). Within weeks, people realised the model’s potential and built amazing demos and applications with astonishing results. In December 2021, OpenAI introduced the capability to fine-tune GPT-3, which means that customers can create their own custom version of the model, tailored to their specific applications.

In this blog post we will learn how to create a custom version of GPT-3 and we will see how an NGO used this technology to create a Strategy Bot for young social entrepreneurs.

Why is this important?

Since the introduction of popular state-of-the-art NLP models like BERT, fine-tuning has been the primary mechanism by which these models are adapted to specific tasks. This technique leverages the concept of Transfer Learning: a pre-trained model is adapted to a new (specialised) domain, which significantly improves model performance and reduces training cost.

Source: https://www.cse.ust.hk/~qyang/Docs/2009/tkde_transfer_learning.pdf

GPT-3 initially didn’t offer the option of fine-tuning. Instead, it was trained on a wide variety of tasks and specialised domains, and it performed impressively well without further training. Over time, however, organisations started realising that the out-of-the-box model, while impressive, was often not quite good enough to use in production.

The option of fine-tuning GPT-3 on a custom dataset and creating a custom version of it can push model performance over the threshold at which organisations will be comfortable using it for their production workloads.

Problem statement

Founded in 1980, Ashoka is the world’s largest network of social entrepreneurs. These social entrepreneurs develop system-changing solutions, which usually start with a strategy paper detailing the problem they’d like to solve, their approach, and how their solution could scale up for indirect and systemic impact. Developing these strategies can be a daunting task, and Ashoka wanted to support the social entrepreneurs with a Strategy Bot, i.e. a text generation model that could help them write these strategies.

They tried the vanilla version of GPT-3 and found the results promising, but not quite good enough for their specific purpose. They knew about the option of fine-tuning GPT-3 and needed someone who could train the model on their dataset, which is where I came into the picture 🙂 So, let’s have a look at how we went about this!

Solution walk-through

Ashoka had over 4,000 examples of previous strategy papers available, which, with a bit of feature engineering, made for a great training dataset for GPT-3. We had to prepare that dataset in a specific format, and before kicking off the training we also wanted to get a sense of how much it would cost. Let’s go through it step by step.

Data preparation

The dataset Ashoka had available consisted of several columns with text of interest, and our goal was to distil those down to just two columns: one for the prompt text and one for the text we want GPT-3 to (ideally) generate from the prompt. This is how, according to OpenAI’s guidelines, the data needs to be prepared for fine-tuning:

Source: https://beta.openai.com/docs/guides/fine-tuning
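Concretely, the guidelines expect the training data as a JSONL file in which each line holds one prompt/completion pair:

```
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
```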

For the prompt text we could use a column in the dataset called Introduction. Some of those texts were still too long, though, so we decided to take just the first two sentences as the prompt. In addition, we found that GPT-3 performed even better if we appended a short instruction at the end of the prompt.
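A minimal sketch of the idea, assuming a pandas DataFrame with an Introduction column (the file name, the naive sentence split, and the instruction wording are all illustrative, not the project’s actual code), could look like this:

```python
import pandas as pd

# Hypothetical export of the strategy paper dataset
df = pd.read_csv("ashoka_strategies.csv")

# Illustrative instruction; the exact wording used in the project may differ
INSTRUCTION = "\n\nWrite a strategy paper based on this idea:"

def build_prompt(introduction: str) -> str:
    # Naive sentence split; a proper sentence tokeniser would be more robust
    sentences = introduction.split(". ")
    short_intro = ". ".join(sentences[:2]).rstrip(".") + "."
    return short_intro + INSTRUCTION

df["prompt"] = df["Introduction"].apply(build_prompt)
```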

The result of this exercise would then look like this:

Image by author

Similarly, we compiled texts from several other columns into the completion feature to give GPT-3 an idea of what we want it to generate. To learn more about what these strategies entail, you can check out https://www.ashoka.org/en-us/story/strategy-plan-action.

It’s important to note that the completion text should have a unique phrase to indicate the end of the text generation (more information about this can be found on OpenAI’s website). Thus, we ended up with a small piece of code to create the completion.
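A sketch that continues from the DataFrame above (the column names and the "###" end token are assumptions, not the project’s actual choices) might look like this:

```python
END_TOKEN = " ###"  # unique phrase that signals the end of the generation

def build_completion(row):
    # Concatenate the text columns that describe the strategy;
    # the column names here are illustrative
    parts = [row["Problem"], row["Approach"], row["Systemic Impact"]]
    text = "\n\n".join(p for p in parts if isinstance(p, str))
    # OpenAI recommends starting the completion with a whitespace character
    return " " + text + END_TOKEN

df["completion"] = df.apply(build_completion, axis=1)

# Keep only the two columns OpenAI expects and export them as JSONL
df[["prompt", "completion"]].to_json(
    "training_data.jsonl", orient="records", lines=True
)
```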

Cost estimation

Now that we have the data ready to go, we could submit it to create the training job. Before we do that, though, we want to make a quick calculation of how much the training will cost; nothing is worse than training a model and only realising afterwards that it cost way more than expected. (It would actually be great if OpenAI provided a feature that estimates the price of a fine-tuning job beforehand, but to my knowledge this did not exist at the time of writing.)

Luckily, OpenAI’s pricing website gives us some clues on how to calculate the price: training the most capable model (Davinci) costs 3 US cents per 1,000 tokens. The website also states that tokens are word pieces and that 1 token equates to roughly 4 characters:

Source: https://openai.com/api/pricing/

Finally, OpenAI also provides some guidance on how many examples we should train the model on:

Source: https://beta.openai.com/docs/guides/fine-tuning/preparing-your-dataset

We decided to go with 500 examples, and our estimate was therefore that the training would cost us roughly USD 10, which is quite affordable 🤗
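As a back-of-the-envelope check, the estimate can be sketched in a few lines using the DataFrame from the sketches above (note that the billed tokens also scale with the number of training epochs, which this sketch ignores):

```python
PRICE_PER_1K_TOKENS = 0.03  # USD, Davinci fine-tuning at the time of writing
CHARS_PER_TOKEN = 4         # 1 token is roughly 4 characters

# Illustrative selection of the 500 training examples
sample = df[["prompt", "completion"]].head(500)
total_chars = (sample["prompt"].str.len() + sample["completion"].str.len()).sum()
total_tokens = total_chars / CHARS_PER_TOKEN

print(f"Estimated cost: USD {total_tokens / 1000 * PRICE_PER_1K_TOKENS:.2f}")
```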

Fine-tuning

We have prepared the data and we have a good idea of what the fine-tuning will cost us, so now is the time to pull the trigger. OpenAI’s API reference is very clear and easy to follow: to fine-tune a model, all we need to do is upload the training file and call the FineTune.create API.

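With the openai Python library as it existed at the time, those two calls could be sketched roughly like this (the file name is the one from the sketch above; the API key handling is an assumption):

```python
import openai

openai.api_key = "sk-..."  # your OpenAI API key

# Upload the prepared JSONL training file
upload = openai.File.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Kick off the fine-tuning job on the Davinci base model
job = openai.FineTune.create(
    training_file=upload["id"],
    model="davinci",
)
print(job["id"])  # the job ID, which can be used to monitor progress
```

The job’s status can then be polled with openai.FineTune.retrieve until it reports that the training has succeeded.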

Once submitted, the training job took about two hours to complete, but this will obviously depend on your training data, so your mileage may vary. Once the training job has completed, the new model is available in the Playground app:

Image by author

Now we can finally test our very own GPT-3 model 😃

Testing

There is not much we can automate when it comes to testing these models; unfortunately, there are still no good benchmarks out there (to my knowledge) that assess whether a generated text is “good”. So we did our own manual testing, and we were very surprised by how well the model performed.
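A manual test against the custom model boils down to a single completion call; in this sketch, the model name, the example idea, and the sampling parameters are all placeholders:

```python
import openai

response = openai.Completion.create(
    model="davinci:ft-ashoka-2022-01-01-00-00-00",  # placeholder model name
    prompt="We want to improve access to education for children in rural "
           "areas. Our approach trains local volunteers as tutors."
           "\n\nWrite a strategy paper based on this idea:",
    max_tokens=500,
    temperature=0.7,
    stop=[" ###"],  # the end token used during training
)

print(response["choices"][0]["text"])
```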

The strategies that the model created from a simple idea included the indirect and systemic impact, and it also included all kinds of clever tactics around open-sourcing and training other organisations. This was exactly what we needed 🎉

Conclusion

In this blog post we have seen how easy (and affordable) it is to create our very own GPT-3 model. We have seen that the results of the fine-tuned model exceeded our expectations and were good enough to use in a production workload. Ashoka is currently testing and implementing this model for internal use and will be rolling it out to social entrepreneurs to help them create new strategies to change the world for the better 🌍

If you’re interested in connecting, please reach out via LinkedIn or via https://www.aiml.consulting/
