
In this tutorial you will learn how to use Hugging Face DLCs to deploy pretrained models and connect them to a website in order to create a chatbot. While this tutorial focuses on conversational models, you can apply the key concepts to any kind of application.

This article was authored by Roberto Zappa.
Nowadays, thanks to Amazon AWS, we have many ways to deploy AI models, allowing us to create applications that were unattainable a few years ago. Recently, Amazon announced support for Hugging Face in AWS Deep Learning Containers (DLCs), simplifying the deployment of models on Amazon SageMaker. In this article I want to combine this new method with other AWS services in order to create a website with a chatbot. I will use the pretrained DialoGPT-medium model available on Hugging Face.
The final architecture will look like this:

By the end of this tutorial you will be able to:
- Create a SageMaker endpoint using Hugging Face DLCs
- Create a Lambda function and call the endpoint inside it
- Create, configure, and connect an API Gateway with the Lambda function
- Create a web page on Amplify and connect it to the API Gateway
Ready? Let’s go!
Create a SageMaker endpoint using Hugging Face DLCs
On Hugging Face, choose the model you want to use and click "Deploy" → "Amazon SageMaker".

Now you have to choose the Task and Configuration of the model. Different tasks require different input data structures for making predictions; you can see the different data structures here. Select "Conversational" and "AWS".

Copy the generated code and run it inside a Jupyter notebook on the SageMaker notebook instance that you want to use.
Pretty simple right?
Warning
At the time of writing, a bug prevents making predictions in SageMaker with conversational models. A few code changes are needed to work around it. No worries, you can copy my code:
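Here is a sketch of the deployment code, based on the snippet the "Deploy" button generates at the time of writing (the library versions and instance type below are the ones I used; yours may differ). The workaround replaces `predictor.predict()` with a direct `invoke_endpoint` call carrying a JSON payload shaped for the conversational task:

```python
# Deploy DialoGPT-medium on SageMaker via the Hugging Face DLC.
# Versions below reflect the generated snippet at the time of writing.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# Hub configuration: the DLC reads the model id and task at startup
hub = {
    "HF_MODEL_ID": "microsoft/DialoGPT-medium",
    "HF_TASK": "conversational",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.6.1",
    pytorch_version="1.7.1",
    py_version="py36",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Workaround for the conversational bug: call the endpoint directly
# with boto3 instead of using predictor.predict().
import json
import boto3

runtime = boto3.client("sagemaker-runtime")
payload = {
    "inputs": {
        "past_user_inputs": [],
        "generated_responses": [],
        "text": "Hello, how are you?",
    }
}
response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name,
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))
```

Note that this code must run inside the SageMaker notebook instance, where an execution role is available.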
We can see the endpoint just created through the SageMaker console → "Inference" → "Endpoints". Copy its name, since you will need it in the following sections. Also, remember to delete the endpoint when you no longer need it, to avoid incurring unnecessary costs. You can delete it through the SageMaker console or by running the code below inside your Jupyter notebook:
predictor.delete_endpoint()
Create a Lambda function and call the endpoint inside it
Now we have to create our Lambda function. Log into the AWS Lambda console and click "Create function". Name it, select the language you want (I chose Python 3.8), and click "Create function".

In the "Code" section, paste this code into "lambda_function.py":
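Below is a sketch of the handler, assuming the endpoint was deployed with the "conversational" task and that the web page sends the current message plus the conversation history in the request body (the field names are my own convention; adapt them to your front end):

```python
# lambda_function.py — forwards chat requests to the SageMaker endpoint.
import json
import os


def build_payload(text, past_user_inputs=None, generated_responses=None):
    """Shape the request body the way the conversational task expects."""
    return {
        "inputs": {
            "past_user_inputs": past_user_inputs or [],
            "generated_responses": generated_responses or [],
            "text": text,
        }
    }


def lambda_handler(event, context):
    import boto3  # preinstalled in the Lambda Python runtime

    body = json.loads(event["body"])
    payload = build_payload(
        body["text"],
        body.get("past_user_inputs"),
        body.get("generated_responses"),
    )

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=os.environ["ENDPOINT_NAME"],
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read())

    # The CORS header lets the Amplify page call the API from the browser
    return {
        "statusCode": 200,
        "headers": {"Access-Control-Allow-Origin": "*"},
        "body": json.dumps(result),
    }
```

Reading the endpoint name from an environment variable (set in the next step) keeps the function code independent of any specific deployment.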
To connect the endpoint to this function, select "Configuration" → "Environment variables" → "Edit", write "ENDPOINT_NAME" under Key and the name of the endpoint you copied in the previous section under Value. Click "Save".

Last but not least, we have to grant the right permissions to our function: select "Configuration" → "Permissions" and click on the Role name.

Grant this permission to your policy:
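Concretely, the role needs permission to invoke the SageMaker endpoint. A minimal policy statement looks like the one below (I use "*" as the resource for brevity; you can scope it to your endpoint's ARN instead):

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "*"
        }
    ]
}
```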
Create, configure, and connect an API Gateway with the Lambda function
In API Gateway, click "Create API", then click "Build" under REST API.

On the following page, give your API a name, select "Edge optimized" for "Endpoint Type", and click the blue "Create API" button.

From "Actions" select "Create Method"

For the new method, select "POST" and confirm it. Select the POST method and choose "Lambda Function" as the "Integration type". After entering the name of your Lambda function, click "Save" and, on the following page, click "OK".

Now, again from "Actions", select "Enable CORS" and, without changing anything, click the "Enable CORS and replace existing CORS headers" button, then click "Yes, replace existing values" in the confirmation message.
The last thing to do is deploy it: from "Actions", select "Deploy API". For "Deployment stage" select "[New Stage]", choose a name, and click "Deploy".

On the next page you will see an "Invoke URL". Copy it; you will use it inside your HTML file to invoke your API Gateway.

Test our chatbot
We're almost finished!
Now that we have our Invoke URL, we can test our chatbot with the following Python program. The only change you need to make is to replace the URL in the main() function with your Invoke URL.
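Here is a sketch of such a test client, using only the standard library. The placeholder URL, the request field names, and the "generated_text" response field assume the Lambda forwards the conversational task's output unchanged; adapt them if your handler differs:

```python
# Minimal command-line client for the chatbot API.
import json
import urllib.request

# Replace with the Invoke URL of your API Gateway stage
API_URL = "https://<api-id>.execute-api.<region>.amazonaws.com/<stage>"


def build_body(text, past_user_inputs, generated_responses):
    """Encode the message and conversation history as a JSON request body."""
    return json.dumps(
        {
            "text": text,
            "past_user_inputs": past_user_inputs,
            "generated_responses": generated_responses,
        }
    ).encode("utf-8")


def ask(text, past_user_inputs, generated_responses):
    request = urllib.request.Request(
        API_URL,
        data=build_body(text, past_user_inputs, generated_responses),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def main():
    past_user_inputs, generated_responses = [], []
    while True:
        text = input("You: ")
        if text.strip().lower() in ("quit", "exit"):
            break
        result = ask(text, past_user_inputs, generated_responses)
        reply = result["generated_text"]
        print("Bot:", reply)
        # Keep the history so the model sees the whole conversation
        past_user_inputs.append(text)
        generated_responses.append(reply)


if __name__ == "__main__":
    main()
```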
If everything works fine, we can move on to connecting the chatbot to our web page.
Create a web page on Amplify and connect it to the API Gateway
In AWS Amplify, select "New app" → "Host web app", then on the following page select "Deploy without Git provider" and click "Continue".

Now choose whatever names you like and drag and drop your HTML/CSS code. The HTML file must be named "index.html". If you want to upload only the HTML file, you have to compress it into a zip archive. Otherwise, create a folder with any name you prefer and put the HTML/CSS files inside it. To connect your Amplify web app with your API Gateway, use the URL provided by API Gateway inside your HTML code whenever you want to send data to or receive data from your model.
And that’s all!
Conclusion and future improvements
In this tutorial we learned how to deploy pretrained Hugging Face models to Amazon SageMaker and connect SageMaker endpoints to other AWS services. In the future it could be useful to learn how to fine-tune our pretrained models to support different use cases and overcome the limitations of the standard models, but that's another story 😁 .
Thank you for reading! I hope this helped you.
About the author
Hi, I am Roberto Zappa, Machine Learning Engineer at Neosperience. I am working with ML technologies and AWS infrastructure. Passionate about technology, I love Machine Learning and Computer Vision.
You can find me on Medium or LinkedIn.
Neosperience unlocks the power of empathy with software solutions that leverage AI to enable brands to understand, engage and grow their customer base. Reach out at www.neosperience.com.