
I normally hate click-bait titles, so I’ll start this article off by promising this is not one. The HuggingFace Hub contains thousands of NLP models tailored for tasks ranging from summarization to text generation. Earlier this year, Amazon SageMaker partnered with HuggingFace to release AWS-managed HuggingFace containers, making it easier than ever to bring HF models and scripts to the cloud. In this article, we’ll explore a quick example of deploying the bert-base-uncased pre-trained model from the HF Hub on AWS SageMaker with a few simple lines of code. Some prior knowledge of AWS and how SageMaker operates is expected to fully understand the content.
Table of Contents
- Setup
- HF Deployment on SageMaker
- Conclusion/Additional Resources
Setup
For this example, we’ll be working with traditional SageMaker notebook instances. First go to SageMaker on the AWS Console, then click on Notebook Instances. Within Notebook Instances, you’ll have the option to create a Notebook Instance. Here you can select an instance with appropriate computing power; this example is not that intensive, so we can go with a cheaper ml.t2.medium.

Next you’ll be required to create a role, which gives your instance permissions to work with other services if necessary. For this use case, keep the default execution role with SageMaker full access, as we will not be working with any other AWS services. Next, create the Notebook Instance, and you should be given an option to access a JupyterLab environment that you can work in. Here you can create a Notebook with an appropriate kernel; the Python3 kernel will be adequate for this example. Now we have a Notebook where we can deploy our HF model.

HF Deployment on SageMaker
To access the model that we are working with, go to the bert-base-uncased page on the HF Hub. This model is pre-trained and can complete a variety of tasks; we’ll be using it specifically for text classification. What’s awesome about using pre-trained models from the Hub is that they already give us instructions for deploying on SageMaker, as the two are integrated. Once on the model page, in the top right you’ll notice a Deploy button. If you click on it, you’ll see Amazon SageMaker among the options.

After clicking on Amazon SageMaker, you can pick the task type (text classification) and, for configuration, pick AWS. This will provide the boilerplate code that you can use in your Notebook Instance to deploy this specific model.
The HF Model ID and Task are how the HuggingFace container understands what model we are working with and what problem it is trying to solve. We can now define the HuggingFace Model through SageMaker’s built-in support, as shown in the sketch below.
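A minimal sketch of what that boilerplate looks like follows. The container versions (transformers_version, pytorch_version, py_version) below are assumptions from the time of writing; copy the exact versions from the snippet the Hub generates for you.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# The execution role attached to this notebook instance
role = sagemaker.get_execution_role()

# Hub configuration: tells the HF container which model to pull and which task to serve
hub = {
    'HF_MODEL_ID': 'bert-base-uncased',
    'HF_TASK': 'text-classification'
}

# Define the HuggingFace Model (versions here are assumed; use the ones from the Hub snippet)
huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version='4.6.1',
    pytorch_version='1.7.1',
    py_version='py36',
)
```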
After creating the HuggingFace Model, the last step is a simple one: deploying it for inference with a SageMaker real-time endpoint.
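Deployment is a single call. The instance type below is an assumption; any instance with enough memory for BERT will do.

```python
# Deploy the model to a SageMaker real-time endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge'  # assumed instance type; pick one that fits your budget
)
```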
After the endpoint is deployed successfully, we can quickly test some sample inference.
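A quick sanity check might look like the following; the input sentence is just a placeholder.

```python
# Send a sample payload to the endpoint and print the prediction
sample = {'inputs': 'I love using SageMaker with HuggingFace!'}
result = predictor.predict(sample)
print(result)
```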
With this we should see text classification results displayed with the predicted (majority) class and its score. Note that because bert-base-uncased has not been fine-tuned for classification, expect generic labels (e.g., LABEL_0) rather than meaningful ones.

Make sure to delete the endpoint if you’re not currently using it, so you don’t incur additional costs.
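One way to clean up from the same notebook:

```python
# Tear down the endpoint (and, optionally, the model) to stop incurring charges
predictor.delete_endpoint()
predictor.delete_model()
```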
Conclusion
HuggingFace has a vast set of NLP resources, and SageMaker helps bring these models to an incredible scale. Deploying from the HuggingFace Hub is just one method of deployment with SageMaker. You can also deploy HuggingFace models from model data stored in S3 (a brief sketch follows below), or train HuggingFace estimators and then deploy with a custom inference script that lets you manage input and output handling. For further exploration, I’ve attached some examples and articles in the Additional Resources section.
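As a rough pointer, deploying from model data in S3 follows the same pattern as before; the S3 path below is a placeholder, and the container versions are assumptions.

```python
from sagemaker.huggingface import HuggingFaceModel

# Deploy from a trained model artifact in S3 (the path is a placeholder)
s3_model = HuggingFaceModel(
    model_data='s3://my-bucket/model.tar.gz',
    role=role,
    transformers_version='4.6.1',
    pytorch_version='1.7.1',
    py_version='py36',
)
predictor = s3_model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge')
```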
I hope this article has been useful for people working with Amazon SageMaker and HuggingFace. Feel free to leave any feedback in the comments or connect with me on LinkedIn if you’re interested in chatting about ML & AWS. Make sure to follow me on Medium for more of my work. Thank you for reading.