
A Practical Guide to MLOps in AWS Sagemaker – Part I

How to implement a CI/CD model development and evaluation pipeline in AWS Sagemaker.

Van Gogh, Vincent. The Starry Night. (source)

This guide grew out of my own frustration at not finding a complete, end-to-end treatment of model development, evaluation, and deployment on AWS. All the guides and tutorials I saw only cover part of the picture and never fully connect the dots. I wanted to write something that helps people understand the complete work that goes into building a model and deploying it so that front-end developers can access it from their websites and apps.

So, let us get started!

I have divided this guide into 2 parts.

  1. Model Development and Evaluation using AWS Sagemaker Studio.
  2. Model Deployment using AWS Lambda and REST APIs.

Prerequisites:

· AWS Account – The cost to run the entire tutorial will be less than $0.50, so do not worry.

· Understanding of Python – Most of the Machine Learning work today is being done in Python.

· Patience – Failure is the most important prerequisite to success, so keep on trying until it works.

Part 1: Model Development

We will set up a Project in Sagemaker Studio to build our development pipeline.

  1. Log in to your AWS account and select Sagemaker from the list of services.
  2. Select Sagemaker Studio and use Quickstart to create a Studio.
Use the quick start option to set up a Sagemaker Studio. (Image by author)

Once the Studio is ready, open it with the user you just created. It might take a few minutes for the application to be created, but once everything is set up, we can create our project. The thing to understand is that we can only create one Studio, but that Studio can have multiple users, and every user can create multiple projects.
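
If you are curious what Quickstart does behind the scenes, it is roughly equivalent to creating a Studio domain through the API. Below is a hedged sketch with boto3; every value is a placeholder for your own account's details, and Quickstart additionally creates the user profile and execution role for you.

import boto3

# Rough API equivalent of the Quickstart flow (placeholders throughout).
sm = boto3.client("sagemaker")
sm.create_domain(
    DomainName="mlops-guide",          # placeholder
    AuthMode="IAM",
    DefaultUserSettings={
        "ExecutionRole": "arn:aws:iam::123456789012:role/SagemakerExecutionRole"  # placeholder
    },
    SubnetIds=["subnet-0abc1234"],     # placeholder
    VpcId="vpc-0abc1234",              # placeholder
)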

Sagemaker control panel. (Image by author)
  3. Select Sagemaker components and registries from the left navbar and select Create project.
Create project option. (Image by author)

By default, Sagemaker provides templates that can be used to build, evaluate, and host models. We will use one such template and modify it to fit our use case.

  4. Select the MLOps template for model development, evaluation, and deployment from the list and create a project.
Sagemaker project templates. (Image by author)

Once your new project is created, you will find two pre-built repositories. The first one defines your model development and evaluation; the other builds your model into a package and deploys it to an endpoint for consumption by an API. In this guide, we will modify the first template to run our own use case.

  5. Clone the first repository so we can modify the files we need.
Project repositories. (Image by author)

The use case we will be working on is a customer churn model that predicts whether a customer is likely to unsubscribe from a service in the future. As the idea behind this notebook is to learn model development and deployment in the cloud, I will not go into data exploration and will jump directly into pipeline development.

This is the file structure of the repository we just cloned. Now let us go over some of the files we will be working with.

Repository file structure. (Image by author)

· The folder pipelines contains the files needed to create our model development pipeline; by default, this pipeline is named abalone.

· pipeline.py defines the components of our pipeline. Currently, it is defined with default values, but we will change the code for our use case.

· preprocess.py and evaluate.py define the code that we need to execute for preprocessing and evaluation steps in our pipeline.

· codebuild-buildspec.yml creates and orchestrates the pipeline.

You can add more steps to pipeline.py and corresponding processing files. The template also comes with a test folder and a test_pipelines.py file, which can be used to build a separate test pipeline. A sketch of how a pipeline definition hangs together is below.
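
To give you a feel for what you will see when you open pipeline.py, here is a minimal sketch of a pipeline definition, assuming the Sagemaker Python SDK (sagemaker >= 2.x). The step names, versions, and instance types are my own placeholders, not the template's exact values.

import sagemaker
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep

role = sagemaker.get_execution_role()

# Pipeline parameter read by the model-registration step; the default
# "PendingManualApproval" is what makes approval manual later on.
model_approval_status = ParameterString(
    name="ModelApprovalStatus",
    default_value="PendingManualApproval",
)

# Processor that runs preprocess.py as the first step of the pipeline.
sklearn_processor = SKLearnProcessor(
    framework_version="0.23-1",
    role=role,
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

step_process = ProcessingStep(
    name="PreprocessCustomerChurnData",
    processor=sklearn_processor,
    code="pipelines/customer_churn/preprocess.py",
)

# The real template chains training, evaluation, a metric condition, and
# model registration after this step in the same way.
pipeline = Pipeline(
    name="CustomerChurnPipeline",
    parameters=[model_approval_status],
    steps=[step_process],
)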

  6. Rename the folder abalone to customer_churn and make the change in the codebuild-buildspec.yml file to reflect it:
run-pipeline --module-name pipelines.customer_churn.pipeline 
  7. We need to download the data into our default AWS S3 bucket for consumption, which we can do with a notebook. Create a new notebook in the repository from the File tab in the Studio, select a kernel with the basic Data Science Python image, then paste the code below into a cell and run it.
!aws s3 cp s3://sagemaker-sample-files/datasets/tabular/synthetic/churn.txt ./

import os
import boto3
import sagemaker

prefix = 'sagemaker/DEMO-xgboost-churn'
region = boto3.Session().region_name
default_bucket = sagemaker.session.Session().default_bucket()
role = sagemaker.get_execution_role()

# Upload churn.txt to the default bucket as data/RawData.csv under the prefix.
boto3.Session().resource('s3') \
    .Bucket(default_bucket) \
    .Object(os.path.join(prefix, 'data/RawData.csv')) \
    .upload_file('./churn.txt')

print(os.path.join("s3://", default_bucket, prefix, 'data/RawData.csv'))
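
If you want to confirm the upload worked before moving on, a quick check like the one below, run in the same notebook, will raise an error if the object is missing. This is my addition, not part of the template.

# Sanity check: confirm the raw data landed in the default bucket.
s3 = boto3.client("s3")
s3.head_object(Bucket=default_bucket, Key=prefix + "/data/RawData.csv")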

Now we need to modify the code inside pipeline.py, evaluate.py, and preprocess.py to fit our needs.

  8. For the sake of the guide, copy the code from the link to update pipeline.py, preprocess.py, and evaluate.py, but make sure to go through the code for a better understanding of the details; a sketch of what evaluate.py typically does follows below.
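
To make the evaluation step less opaque, here is a rough sketch of what evaluate.py does in this kind of template: unpack the trained model artifact, score the held-out test split, and write an evaluation.json report that a later condition step in the pipeline can read. The container paths follow the processing-job convention; the column layout and metric name are my assumptions for illustration.

import json
import pathlib
import pickle
import tarfile

import pandas as pd
import xgboost

# The training step's model artifact is mounted here by the processing job.
with tarfile.open("/opt/ml/processing/model/model.tar.gz") as tar:
    tar.extractall(path=".")
model = pickle.load(open("xgboost-model", "rb"))

# Test split written by preprocess.py; assume the first column is the label.
test_df = pd.read_csv("/opt/ml/processing/test/test.csv", header=None)
y_test = test_df.iloc[:, 0]
X_test = xgboost.DMatrix(test_df.iloc[:, 1:].values)

predictions = model.predict(X_test)
accuracy = float(((predictions > 0.5) == y_test).mean())

# The pipeline's condition step reads this report to decide what happens next.
report = {"binary_classification_metrics": {"accuracy": {"value": accuracy}}}
output_dir = pathlib.Path("/opt/ml/processing/evaluation")
output_dir.mkdir(parents=True, exist_ok=True)
(output_dir / "evaluation.json").write_text(json.dumps(report))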

All set. Once we update the code in these three files, we are ready to run our first pipeline execution. Since we are implementing a CI/CD template, this will be taken care of automatically once we commit and push our code.

  9. Select the Git tab from the side navbar, select the files you have modified to add them to the staging area, then commit and push the changes to the remote repository.
Commit changes and push code to remote. (Image by author)

Now go to the Pipelines tab on the project page and select the pipeline you created to check the running executions. You should find one Succeeded job, which was executed automatically when the project was created, and another in the Executing state, which you just triggered by pushing your code. Double-click the executing one to see the pipeline diagram and more details.

Execution tab of your pipeline. (Image by author)
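
The same status information is available programmatically if you prefer a notebook over the UI. Here is a sketch with boto3; the pipeline name is a placeholder you should copy from the Pipelines tab.

import boto3

# List the executions of your project's pipeline and print their status.
sm = boto3.client("sagemaker")
executions = sm.list_pipeline_executions(PipelineName="customer-churn-pipeline")  # placeholder name
for summary in executions["PipelineExecutionSummaries"]:
    print(summary["PipelineExecutionStatus"], summary["PipelineExecutionArn"])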

Hurray! Congratulations, you just executed your first training job.

Pipeline diagram. (Image by author)

Unless something goes wrong, you should see your job Succeeded. But remember, if it were easy, anyone would do it; failure is the first step to success.

The road to success is a rough one. (Image by author)

Once the pipeline completes, it will create a model and add it to your model group. As we set the model approval condition to "Manual" in the pipeline, we will need to select the model and approve it manually to create an endpoint that can be used for inference.

  10. Go to the Model Group tab on your project home page and select the model that has been created. You can check the Metrics page to review the results of the evaluation phase.
Metrics of your model. (Image by author)
  11. If you are satisfied with the metrics, select the approval option in the top right corner to approve the model.
Model approval page. (Image by author)
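
Approval can also be done from code, which is handy once you start automating things. Here is a sketch with boto3; the model package ARN is a placeholder you would copy from the model's detail page.

import boto3

# Flip the model package to Approved, which triggers the deployment pipeline.
sm = boto3.client("sagemaker")
sm.update_model_package(
    ModelPackageArn="arn:aws:sagemaker:us-east-1:123456789012:model-package/customer-churn/1",  # placeholder
    ModelApprovalStatus="Approved",
)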

Here is where our second repository comes into the picture: once you approve the model, the deployment pipeline defined in the second repository will execute to deploy and host a new endpoint, which we can use to make inferences from our API.
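
Part II walks through this in detail, but to show where we are heading, calling a live endpoint from Python looks roughly like the sketch below; the endpoint name and the CSV payload are placeholders for your own values.

import boto3

# Send one CSV row of features to the hosted endpoint and print the prediction.
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="customer-churn-staging",  # placeholder, check the Endpoints tab
    ContentType="text/csv",
    Body="186,0.1,137.8,97",                # placeholder feature row
)
print(response["Body"].read().decode())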

Conclusion

I have tried to keep this guide focused on Sagemaker, as it is long enough already and there is still a Part 2 to come. The objective here is to give a quick overview of the different components of Sagemaker by implementing a simple project. My suggestion to readers is not to follow the guide step by step but to experiment with your own ideas and steps; you will fail often, but you will learn a lot, and that is the agenda. I hope you enjoy working through this guide as much as I enjoyed putting it together. Feel free to drop any suggestions or feedback in the comments; I would love to hear them.

A Practical Guide to MLOps in AWS Sagemaker – Part II

