
Motivation:
In the first chapter, we successfully built a ChatterBot in PyCharm. Now it’s time to train our Chatbot to answer Frequently Asked Questions (FAQs; whatever those are in your company). In this story, we will learn how to do that (what does training mean in this context?), and we’ll take a deeper look at why our Chatbot (hopefully) knows how to answer our questions correctly. We want to create the smartest FAQ Chatbot possible, so let’s roll up our sleeves.
Solution:
In retrieval-based chatbot models, heuristics like Levenshtein distance or cosine similarity select an output from a library of predefined inputs; this library makes up our Chatbot’s knowledge. That just sounds like explicit "if-else" coding, so where is the Machine Learning in ChatterBot, you might ask? Well, if it were only explicit coding, our Chatbot would fail to give the right answer whenever a question is asked slightly differently from what has been coded into the library. In plain English: with traditional coding, if we told our Chatbot that the answer to the question "Do colorful shoes matter?" is "Sure, colors make the world bright", our Chatbot would fail to answer correctly when an Englishman asks, "Do colourful shoes matter?".

Luckily, we do not have to hand-code trillions of rules when we use ChatterBot. Instead, we use ChatterBot’s Levenshtein package to calculate the similarity between strings, i.e. to map the user’s input to the question-answer pairs in our FAQ file Chats.txt.
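To build intuition for this mapping, here is a small standalone sketch (not ChatterBot’s actual internals; the password FAQ pair is made up for illustration) that returns the answer whose stored question has the smallest edit distance to the user’s input:

import Levenshtein

# Hypothetical FAQ pairs, as they might appear in our Chats.txt
faq = {
    'Do colorful shoes matter?': 'Sure, colors make the world bright',
    'How do I reset my password?': 'Use the "Forgot password" link on the login page',
}

def closest_answer(user_input):
    # Pick the stored question with the smallest edit distance to the input
    question = min(faq, key=lambda q: Levenshtein.distance(user_input, q))
    return faq[question]

print(closest_answer('Do colourful shoes matter?'))  # -> 'Sure, colors make the world bright'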

Our Bot will be trained both on our Chats.txt file and on the ChatterBot corpus.

Via training, we make sure the Bot knows which answers to give.
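A minimal sketch of this training step (assuming ChatterBot 1.0’s trainer API, and that Chats.txt holds alternating question and answer lines; the bot name FAQBot is made up):

from chatterbot import ChatBot
from chatterbot.trainers import ChatterBotCorpusTrainer, ListTrainer

chatbot = ChatBot('FAQBot')

# Train on the general-purpose English corpus shipped with chatterbot-corpus
corpus_trainer = ChatterBotCorpusTrainer(chatbot)
corpus_trainer.train('chatterbot.corpus.english')

# Train on our own FAQ file: alternating question/answer lines
with open('Chats.txt', encoding='utf-8') as f:
    conversation = [line.strip() for line in f if line.strip()]

list_trainer = ListTrainer(chatbot)
list_trainer.train(conversation)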

So when we start the ChatterBot and ask it a question, the ChatterBot will think hard and give the answer most closely related to this question, e.g. via Levenshtein.
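In code, this round trip is a single call (continuing the sketch above; the printed answer assumes the shoes pair is in Chats.txt):

response = chatbot.get_response('Do colourful shoes matter?')
print(response)             # ideally: 'Sure, colors make the world bright'
print(response.confidence)  # how sure the bot is about this answer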
As a short excursion, let’s understand how Levenshtein basically works, sticking to our simple "colorful shoes" Jupyter Notebook example. We only need to import:
import Levenshtein  # provided by the python-Levenshtein package
...to be able to use Levenshtein’s distance function. The result for two identical sentences is obviously a distance of 0, as we can see in our example:
Levenshtein.distance(
    'Do colorful shoes matter',
    'Do colorful shoes matter')  # returns 0

Now let us slightly amend this sentence, this time using the British spelling "colourful"; inserting the single letter "u" yields a distance of 1:
Levenshtein.distance(
    'Do colorful shoes matter',
    'Do colourful shoes matter')  # returns 1

Finally, if we also append two more words at the end of the sentence (" at all", i.e. seven more characters including two spaces), we receive a Levenshtein distance of 8:
Levenshtein.distance(
    'Do colorful shoes matter',
    'Do colourful shoes matter at all')  # returns 8

Levenshtein distance (also known as edit distance) is very useful for comparing words or similar sentences.
In contrast, cosine similarity makes sense when comparing complete sentences that differ more substantially. If we stuck with Levenshtein in these cases, it would very likely compute a relatively large distance, even though the two sentences convey very similar key information:
Levenshtein.distance(
    'Do colorful shoes matter',
    'At all, do colourful shoes really matter')  # returns 17 – large, despite the similar meaning
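To build intuition, here is a rough cosine-similarity illustration using scikit-learn (ChatterBot itself uses its own comparison functions; this sketch is only meant to show why the two sentences above still count as similar):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    'Do colorful shoes matter',
    'At all, do colourful shoes really matter',
]

# Turn both sentences into TF-IDF vectors and measure the angle between them
vectors = TfidfVectorizer().fit_transform(sentences)
score = cosine_similarity(vectors[0], vectors[1])[0][0]
print(score)  # clearly above 0, since 'do', 'shoes' and 'matter' are shared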

Thanks to ChatterBot’s Machine Learning classifiers, our Chatbot can use the context of the conversation to select the (hopefully) best response from a predefined list of messages. So our Chatbot not only understands that "colors" and "colours" are most likely the same, but it should also know that shoes are closely related to clothing (even though we have never trained our Chatbot with the word "clothing" before). The theoretical background for this is beyond this post’s scope. Still, it is enough for us to understand that, mainly thanks to NLTK (Natural Language Toolkit) and measures such as cosine similarity, our Chatbot can pick up the connection between shoes and clothing. Have a look at this impressive visualization to get an idea of a high-dimensional word vector space:
Embedding projector – visualization of high-dimensional data

In simple words, our Chatbot looks at the user’s question and searches for the closest input it has been trained on. Then, the Bot returns the answer linked to that input.
To tune these parameters, look into Views.py for our Logic Adapter settings. Logic adapters determine how ChatterBot selects a response to a given input statement, and there are multiple ways of setting them. We have used BestMatch in our App; it returns a response based on known answers to the closest match to the input (the user’s question). We have set the maximum similarity threshold to 0.90, i.e. the amount of similarity between two statements required before the search for a better match stops; the similarity of the chosen match is also used as the bot’s confidence.
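A sketch of the relevant settings in our Views.py (ChatterBot 1.0-style configuration; the surrounding Django code is omitted, and the bot name is illustrative):

from chatterbot import ChatBot

chatbot = ChatBot(
    'FAQBot',
    logic_adapters=[
        {
            'import_path': 'chatterbot.logic.BestMatch',
            # Stop searching once a stored statement matches the input this
            # closely; the similarity of the match becomes the bot's confidence
            'maximum_similarity_threshold': 0.90,
        }
    ]
)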

As is so often the case in life, there is no absolute right or wrong answer to the question of how to set these parameters. It simply depends on your FAQ background and which logic you want to use. Should your ChatterBot only respond when it is sure to give the correct answer? Or should it behave more like a chatterbox, providing an answer every time, even if it is not always the best one? The only honest advice I can give you is to try it out on your data (and to reference the great ChatterBot website for further details).
Congratulations! We now have a complete chatbot running, and we know how to train it to give the best possible answers to our questions. We have merely concentrated on the basic question-answer mapping, but the context could also include the current position in a specific dialogue flow, or previously saved variables like the subject matter (e.g. "do you want the Chatbot to answer questions from topic A or B?"). Conducting dialogue-flow conversations will be the topic of another post yet to come.
For the time being, many thanks for reading! I hope this article is helpful for you. Feel free to connect with me on LinkedIn, Twitter or Workrooms.
Originally published on my website DAR-Analytics.