
Web-Based Chatbot Project, Module 2: GPT-3-generated responses assisted with a database for…

Second part of this project spanning web programming, math, AI, and more

I recently introduced this web-based, entirely client-side chatbot that has two main modules in its "brain":

  • One that retrieves answers from a database of question-answer pairs by using approximate string matching.
  • Another that uses GPT-3 to generate replies from pieces of text parsed internally.

And I showed you in detail how the first module works, together with several details about the string matching functions, database compilation, etc.:

Web-Based Database-Powered Chatbot Project – Module 1: Approximate String Matching

Well now, here’s the other big half of this chatbot’s brain: a module that generates answers to the user’s questions through OpenAI’s GPT-3 language model, extended through few-shot learning from texts that are first retrieved through string matches.

Key takeaway: The chatbot’s GPT-3 module generates answers to the user’s questions through OpenAI’s GPT-3 language model, extended through few-shot learning from texts that are first retrieved through string matches.

At the core: GPT-3-based natural language processing and generation informed with custom texts

Quick recap of GPT-3

The core of this half of the chatbot’s brain is GPT-3, a model developed by OpenAI for text processing and generation. I have introduced and tested GPT-3 quite extensively, especially regarding its capacity to accurately answer factual questions and to be informed with custom data:

Devising tests to measure GPT-3’s knowledge of the basic sciences

GPT-3-like models with extended training could be the future 24/7 tutors for biology students

A Note on GPT-3 and Its (Obviously Null) "Thinking" Capabilities

Why GPT-3 in this project?

GPT-3 is today one of the largest models, but not truly THE largest, and possibly not the best either. Still, it is there in the top 3-5, and to me it is by far the easiest to use. Why? Because you simply make calls to the API that OpenAI makes available; and for web apps you can do this through simple PHP:

Build custom-informed GPT-3-based chatbots for your website with very simple code

By the way, I run PHP in my free Altervista-powered website. This provider offers rich features (I don’t gain anything for promoting it… I just like it!):

To me Simply the Best Free Web Hosting out there -Honest Review, not Paid Advertisement!

The other big reason why I prefer GPT-3 over alternatives is that it is very easy to extend it with information that it can then retrieve naturally. You can do this in two ways:

  • Fine-tuning, which involves retraining small parts of the model with custom information. I could never make this work well…
  • Few-shot learning, which involves preceding the prompt with text that contains the custom information. This works very well, and is very straightforward to implement.

In the next subsection I describe few-shot learning in GPT-3, and then show you how I use it in my chatbot; there is an important limitation to overcome, as you’ll see.

Few-shot learning to inform GPT-3 with specific knowledge

One of the coolest features of GPT-3 is that you can easily simulate teaching it without actually retraining it. This is called "few-shot learning" and consists of providing, right before the prompt, a piece of text containing the information relevant to it.

This is especially useful when you want to ask GPT-3 about a subject that it doesn’t know about… say, for example, a person like you or me, who is not famous. You could then precede your question with that person’s curriculum vitae, or a few paragraphs about them; GPT-3 will process this information as part of the prompt and then elaborate an output text influenced by it, as long as the prompt asks for information that is actually in there.

I have used this exact strategy based on few-shot learning to guide GPT-3’s answering of questions about biology. When the questions are simple, it can reply correctly without any extra learning; but as the topic gets more specialized, it starts getting things wrong or producing no answers at all. If you precede the questions with texts, even complex paper abstracts, it can then retrieve the answers very well, even when the questions are not phrased exactly the way the answers are encoded in the text. You can see some examples of this in action in the following article:

GPT-3-like models with extended training could be the future 24/7 tutors for biology students

Let’s now see how this is all combined with string matching to produce content for few-shot learning in the chatbot.

Topic-specific few-shot learning in the GPT-3 module of this web chatbot

One important limitation of the few-shot learning strategy is that the prompt is limited in the number of tokens it can accept. The exact limit is either 2000 or 4000 tokens, depending on the exact GPT-3 engine and endpoint. This token limit includes both the input, where the information for few-shot learning plus the question are provided to the model, and the output, that is, the text generated by GPT-3.

OpenAI’s official documentation explains that 2000 tokens are around 8000 characters, or 1500 words. This means that if we expect the question plus answer to take up around 100-200 words, then we are left with 1300-1400 words to provide information for few-shot learning. That is a decent amount, but there is no way we can fit large corpora of text, not even short articles. Thus, we cannot provide, say, a whole textbook about a given topic, or in my case the whole content of my papers and blog posts, for few-shot learning.

So this is the procedure I built into my chatbot:

  1. Build a database containing abstracts of my papers and pieces of my blog entries, each of no more than 300 words, stored in separate variables together with the titles and URLs of the full articles.
  2. Clean up the user’s input for numbers, symbols, stopwords, etc.
  3. Search for stretches of 1, 2, or 3 consecutive words from the cleaned-up text inside the titles and abstracts of the database. When there is a match, save the corresponding text.
  4. Merge together all the pieces of saved text, trim the result to no more than around 5000 characters, add the user’s question at the end, and submit the resulting input to GPT-3. This input contains the information for few-shot learning retrieved from the articles with matching keywords, followed by the user’s question.
  5. On return from GPT-3, remove the input to keep only the portion of text that GPT-3 added, presumably containing the answer to the question.

A deeper dive into the procedure

A few specific points about how this module works.

First, I keep the texts in an array of this form:

["Title","Author(s)","Journal, magazine or blog",Year,"URL","Text"]

Where the last field, "Text", is a long string used to build up the prompt for few-shot learning by GPT-3. For scientific texts this is simply the abstract of the publication or preprint, which are all openly available, for example at PubMed. For magazine articles and blog entries, this text is either a summary of the article or its first few paragraphs.

The following are examples taken from my chatbot program in its current version:

From one of my peer-reviewed papers:

texts.push(["Democratizing interactive, immersive experiences for science education with WebXR","F Cortes Rodriguez, M Dal Peraro and LA Abriata","Nature Computational Science", 2021, "https://www.nature.com/articles/s43588-021-00142-8","Objects and concepts of abstract nature, extreme sizes, poor availability or high purchase costs are often hard to understand and experiment with. We humans use […] entirely virtual, as in virtual reality (VR), or an extended view of the real world, as in augmented reality (AR)."])

From one of my blog entries:

texts.push(["Basil, strawberries, salads and tomatoes growing powerful in the orchard balcony. Plus… what’s that? Watermelons? Zucchinis?","L Abriata","Gardening, Birding, and Outdoor Adventure (Medium)",2022,"https://medium.com/gardening-birding-and-outdoor-adventure/basil-strawberries-salads-and-tomatoes-growing-powerful-in-the-orchard-balcony-d6fad7ee6e73","Here you see one of them on the rack I built to raise the plants and make them reach sunlight. With some corn salad in front, too. These are basil plants […] Strawberries have been living in my balcony for years almost wild, but this time I put them in more sunbathed places. A colorful setting thanks to these snapdragons."])

Second, cleaning the input string is essential, so that the string search can focus on the meaningful words. You can check my previous article to see the lists of word fixes, stopwords, symbols, etc. incorporated into my string cleaning function.

Note that as soon as some articles match, the chatbot displays a link to a list of them. This is always useful, but especially so when the user doesn’t have a GPT-3 API key, in which case the chatbot won’t be able to reply based on the text contents; the user can at least go read the articles themselves.

Third, the chatbot checks whether an API key for GPT-3 is being provided by the user or not. If none is provided, it doesn’t even attempt to run this module. If it is provided, it informs the user that it’s going to try to use it, making it clear that it might take a minute or so to get an answer.

This last point is important because the API call works asynchronously, so it could well happen that the user types something before the chatbot returns a reply from the GPT-3 module. To make the conversation more natural, when the chatbot is about to present the outcome of the GPT-3 procedure it first reminds the user of the original question: "regarding the question about … then …"

You can see this and the other points discussed so far in the examples that follow.

Example conversations with the final web chatbot

Here you have a conversation that combines replies based on queries through string matching (presented in my previous post) and through calls to GPT-3:

A chat session that used the two modules of this web-based chatbot, as analyzed below. Screenshot by author.

Here I transcribed it for easier reading, with comments in-between in plain font format:

Bot: Hello visitor of Luciano’s site! I’m a chatbot trained with information about Luciano, his jobs, projects, and hobbies. To chat with me, type your text above and hit enter. To know what I can help you with, type ‘Help’ or ask something like ‘What can you do for me?’

That was the default sentence that the chatbot displays on load.

Visitor: hi

Bot: Hello, all good?

Visitor: yep

Bot: That’s great.

That piece of chat comes from the very rich database I built based on Microsoft’s open dataset for chatbot personalities, as explained in my previous post.

Visitor: i’m here to ask about molecularweb

Bot: moleculARweb is a website for education about chemistry and structural biology using commodity augmented reality that runs on regular smartphones, tablets and computers. To use the basic activities in the website you need to print some markers, enter the website at https://molecularweb.epfl.ch, chose an activity and show the markers to the webcam.

This is still a reply from the database. And it continues below, where "it" is internally replaced by "molecularweb" (the last topic; all this is in my previous article), which allows an exact match to a question in the database.

Visitor: how do i use it?

Bot: You can use moleculARweb at https://molecularweb.epfl.ch. You will have to print some markers and enable your webcam when the browser prompts. To know more, check these articles from Luciano’s blog (which also point you to some related peer-reviewed publications):

Visitor: ok good

Bot: Good.

That last response is also from the chit-chat part of the database.

And now GPT-3 kicks in, when the user asks something very specific about molecularweb: its virtual modeling kit.

Visitor: and what is the virtual modeling kit?

Bot: Didn’t get that, can you rephrase? In the meantime I will search inside my database of articles to see if I can generate an answer using my GPT-3 module.

The bot first clarifies that it’s not very sure about what was asked, but it will try its GPT-3 module.

To enhance your experience chatting with me, please be careful with typos and grammar. Thanks!

Just in case there was a typo or a complicated question, it also clarifies how the chat can be more fluid.

I found some articles that might be of your interest. You can check the list of papers here.

From the abstracts of these papers, I will now try to address your question by using my GPT-3 module. I will come back to you with this soon… (it might take a minute or so)

If it finds articles containing the keywords of the question (remember these are cleaned of symbols, stopwords, etc.), then it says this and even creates a clickable link to see the articles. Note it warns the user that this might take some time.

Last, when GPT-3 returns the results, the chatbot displays them, preceding the text with a reference to the original question.

Bot: Regarding your question ‘and what is the virtual modeling kit?’… The Virtual Modeling Kit (VMK) is a tool where users can build and view molecules, and explore their mechanics, in 3D AR by handling the molecules in full 3D with custom-printed cube markers (VMK 2.0) or by moving around a simulated scene with mouse or touch gestures (VMK 3.0). Upon simulation the molecules experience visually realistic torsions, clashes, and hydrogen-bonding interactions that the user can manually switch on and off to.

Another conversation, this time about the wild daffodils that grow in Switzerland

Based on information taken from a post of my blog that is included in the list of articles for few-shot learning:

A story on Switzerland’s wild daffodils

I asked "what soil do daffodils need to grow?". I know this information is somewhere in there. And it could find it:

Visitor: what soil do daffodils need to grow?

Bot: Not sure what you are asking, you better ask me again in a different way. In the meantime I will search inside my database of articles to see if I can generate an answer using my GPT-3 module.

To enhance your experience chatting with me, please be careful with typos and grammar. Thanks!

It said the above text because there’s no match at all to anything in the question-answer database. But as you see below it does find the article, so it goes through the GPT-3 module, which can in fact answer the question!

I found some articles that might be of your interest. You can check the list of papers here.

Let me apply on them my GPT-3 module… (may take me a minute or so)

Bot: Regarding your question ‘what soil do daffodils need to grow?’… Daffodils like fresh soil, slightly moist and rich in nutrients, such as cow grazing fields -but without cows who will eat them! Daffodils want fresh air but not too cold, so here they grow from around 700 to 1800 m.a.s.l. They want sun, so they thrive in the open spaces of Forest facing the South..

It actually answered more than what I asked, but it’s OK, at least it is all information related to where these flowers grow.

Another cool thing enabled by GPT-3: it makes your bot multilingual!

If you ask my chatbot in languages like Spanish or French, and the question goes through the GPT-3 module, then there is a good chance of getting an answer back. Here’s, for example, what I got when I asked it about molecularweb in these two languages:

Visitor: que es molecularweb?

Bot: In principle I don’t know what to say, but I will search inside my database of articles to try to come up with some answer using my GPT-3 module.

To enhance your experience chatting with me, please be careful with typos and grammar. Thanks!

I found some articles that might be of your interest. You can check the list of papers here.

Let me apply on them my GPT-3 module… (may take me a minute or so)

Bot: Regarding your question ‘que es molecularweb?’… es un sitio web que permite a los estudiantes y educadores manipular representaciones virtuales de átomos, moléculas, sus orbitales y complejos biomoleculares y estructuras biológicas, para enseñar y aprender sobre la estructura molecular, los orbitales atómicos y moleculares, y la teoría VSEPR; la estere.

Visitor: qu’est-ce que c’est molecularweb?

Bot: Please restate that. In the meantime I will search inside my database of articles to see if I can generate an answer using my GPT-3 module.

To enhance your experience chatting with me, please be careful with typos and grammar. Thanks!

I found some articles that might be of your interest. You can check the list of papers here.

Let me apply on them my GPT-3 module… (may take me a minute or so)

Bot: Regarding your question ‘qu’est-ce que c’est molecularweb?’… molecularweb est un site web pour l’éducation et la vulgarisation en chimie et en biologie structurale à travers du contenu en réalité augmentée (RA) qui s’exécute dans les navigateurs web de périphériques courants tels que les smartphones, les tablettes et les ordinateurs. Ici, nous présentons deux évolutions des kits de mod.

If you can read these languages you’ll confirm that it worked very well. The truncations at the end are independent of the language; they also happen sometimes in English. They come from the maximum response length (the max_tokens parameter) requested when calling the API.

I found GPT-3 also capable of answering in Italian and a few other languages. On the other hand, I saw it fail in languages like Hungarian, which is probably not well represented in the corpus originally used to train it.


Want to play with this chatbot?

You can find it here:

http://lucianoabriata.altervista.org/chatbotallweb/chatbottest.html

Please do note that I’m constantly improving the chatbot, both at the level of the code and at the level of its two databases (question-answer pairs and GPT-3-informing texts)… so you may experience problems if you visit right when I’m working on it. If that happens, try again later.

And remember that you need an OpenAI API key to use GPT-3, and that you can get some free credits by… well, asking my chatbot, for example with "how do i get an api key for gpt-3?"


www.lucianoabriata.com I write and photoshoot about everything that lies in my broad sphere of interests: nature, science, technology, programming, etc. Become a Medium member to access all its stories (affiliate links of the platform for which I get small revenues without cost to you) and subscribe to get my new stories by email. To consult about small jobs check my services page here. You can contact me here.

