Amazon SageMaker Studio is a fully integrated IDE unifying the tools needed for managing your ML projects and collaborating with your team members.

Alongside providing pre-built images for running your notebooks, SageMaker Studio allows you to create containers with your favourite libraries and attach them as custom images to your domain.
In most large enterprises, ML platform administrators manage those custom images to ensure Studio users only work with approved libraries. Done manually, this adds operational overhead for admins.
In this post I show how you can automate a Studio domain setup by implementing simple continuous delivery for your custom images.

We will create a custom image based on TensorFlow 2.5.0 and attach it to a Studio domain. We will use CodeCommit, CodeBuild, and CodePipeline for continuous delivery of the custom image. Feel free to use any CI/CD tool of your choice in your environment. The stack is packaged in a CloudFormation template so you can repeat the setup on demand.
If these services are new to you, good starting points are Bringing your own custom container image to Amazon SageMaker Studio notebooks, the SageMaker Studio Custom Image Samples, and the CodePipeline Tutorials.
Walkthrough overview
We will tackle this in 3 steps:
- We will first launch a stack to create a CodeCommit repo, a CodeBuild project, and a CodePipeline pipeline.
- Then, we will push the custom image Dockerfile, the configs, and the buildspec.yaml into the code repo. This will trigger the pipeline.
- Once the image is attached to the domain, we will launch a notebook with the custom kernel in Studio.
Prerequisites
To go through this example, make sure you have the following:
- Familiarity with the Custom Image capability of SageMaker Studio, which this example builds on. Read this blog and those code samples before starting.
- Access to an Amazon SageMaker Studio environment and familiarity with the Studio user interface.
- An IAM role for the custom image. The image receives this role's permissions when it runs. You can use your Studio user profile role ARN, for example.
- This GitHub repository cloned into your environment.
Step 1: Launching the CloudFormation stack
Architecture overview
First, we need to create a CloudFormation stack based on this template.
Below is the architecture overview for the setup:

The stack will create a CodeCommit repo where we will push the Dockerfile, configs, and buildspec.yaml. It will also create a CodePipeline pipeline and a CodeBuild project. The CodeBuild project is responsible for executing the buildspec.yaml file, which builds the container image, pushes it to ECR, and attaches it to the Studio domain. The template will also create the base SageMaker image and other useful resources such as IAM roles.
Creating the stack in your account
To create the stack, follow these steps:
- Navigate to the AWS CloudFormation console page. Make sure you are in the same AWS Region as your SageMaker Studio domain.
- Select Upload a template file to create a stack with the template

- Choose Next

Here you will need to input a stack name, the SageMaker Studio Domain ID, and the role ARN for the image. You can leave ImageName as it is.

- Choose Next
- Leave all options as default until you reach the final screen
- Select I acknowledge that AWS CloudFormation might create IAM resources.

- Choose Create
This will take a few minutes to complete.
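If you prefer to script the console steps above, the following is a minimal sketch using boto3. It assumes boto3 is installed and AWS credentials are configured. Apart from ImageName, the parameter keys (DomainId, ImageRoleArn) and all values are hypothetical placeholders, so check the template for the real names before running it.

```python
from typing import Dict, List


def build_stack_parameters(values: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert a plain dict into the Parameters list CloudFormation expects."""
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in values.items()
    ]


def create_stack(stack_name: str, template_path: str, values: Dict[str, str]) -> str:
    """Create the stack and return its ID. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    with open(template_path) as handle:
        template_body = handle.read()

    cloudformation = boto3.client("cloudformation")
    response = cloudformation.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=build_stack_parameters(values),
        # Scripted equivalent of the IAM acknowledgement checkbox.
        # Templates that name their IAM resources need CAPABILITY_NAMED_IAM.
        Capabilities=["CAPABILITY_IAM"],
    )
    return response["StackId"]


# Hypothetical parameter keys and placeholder values, for illustration only.
example_parameters = build_stack_parameters(
    {
        "DomainId": "d-xxxxxxxxxxxx",
        "ImageRoleArn": "arn:aws:iam::111122223333:role/my-image-role",
        "ImageName": "my-custom-image",
    }
)
```

Calling create_stack with your stack name, the path to the downloaded template, and your own parameter values starts the same creation process as the console flow.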

Step 2: Updating the code repo with your image
Next, you will need to push the content from the tf25 folder into the CodeCommit repo.

CodeBuild executes the buildspec.yaml file from the repo.
It runs build_and_push.sh to build the container image from this Dockerfile and push it to ECR, and attach_image.sh to attach the ECR image to Studio.
Here you can also run further linting and testing before pushing the image to ECR. You can also enable image scanning on the ECR repository to identify common vulnerabilities.
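To make the build and attach steps more concrete, here is a minimal Python sketch of the AWS calls they plausibly map to. It is not the content of build_and_push.sh or attach_image.sh. The function names are hypothetical, the clients are passed in as arguments (for example boto3.client("sagemaker") and boto3.client("ecr")), and the URI format assumes a private ECR repository in a standard commercial Region.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Return the URI of an image in a private ECR repository."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def register_image_version(sagemaker_client, image_name: str, image_uri: str) -> str:
    """Create a new version of an existing SageMaker image from an ECR image."""
    response = sagemaker_client.create_image_version(
        ImageName=image_name,  # the SageMaker image created by the stack
        BaseImage=image_uri,   # the ECR image that was just pushed
    )
    return response["ImageVersionArn"]


def enable_scan_on_push(ecr_client, repository: str) -> None:
    """Ask ECR to scan every image pushed to the repository."""
    ecr_client.put_image_scanning_configuration(
        repositoryName=repository,
        imageScanningConfiguration={"scanOnPush": True},
    )


# Placeholder account, Region, repository, and tag.
uri = ecr_image_uri("111122223333", "eu-west-1", "my-custom-image", "latest")
print(uri)
```

Passing the clients in keeps the functions easy to exercise with a stub before pointing them at a real account.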
Once the code is pushed to the master branch of the repo, you can navigate to CodePipeline and see your pipeline running.
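If you would rather check the pipeline from a script than from the console, a sketch along these lines could help. It summarizes the response of the CodePipeline get_pipeline_state call. The stage names in the sample are placeholders, and "NotStarted" is my own label for stages without an execution yet, not a CodePipeline status.

```python
from typing import Dict


def summarize_pipeline_state(state: dict) -> Dict[str, str]:
    """Map each stage name to the status of its latest execution."""
    summary = {}
    for stage in state.get("stageStates", []):
        latest = stage.get("latestExecution", {})
        # "NotStarted" is a local label for stages that have not run yet.
        summary[stage["stageName"]] = latest.get("status", "NotStarted")
    return summary


def fetch_pipeline_summary(pipeline_name: str) -> Dict[str, str]:
    """Query CodePipeline and summarize it. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the summary helper works without boto3

    codepipeline = boto3.client("codepipeline")
    return summarize_pipeline_state(
        codepipeline.get_pipeline_state(name=pipeline_name)
    )


# Trimmed-down sample response with placeholder stage names.
sample_state = {
    "pipelineName": "my-image-pipeline",
    "stageStates": [
        {"stageName": "Source", "latestExecution": {"status": "Succeeded"}},
        {"stageName": "Build", "latestExecution": {"status": "InProgress"}},
        {"stageName": "Deploy"},
    ],
}
print(summarize_pipeline_state(sample_state))
```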

After a few minutes, you should see your Studio domain updated with the latest version of the custom image.

From now on, every time you commit, Studio will automatically get a new version of your custom image.
If you have multiple custom images
update-domain-input.json is the config used to update your domain with custom images and their configs.
In this example I run the sagemaker update-domain command in attach_image.sh. If you have multiple custom images, you may keep the domain config file global and run the update command separately from the individual image pipelines.
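As a rough illustration of a global domain config, the sketch below generates the update-domain input for several images at once. The image and config names are hypothetical, and the sketch assumes the CustomImages list you pass replaces the one currently on the domain, which is why every image is listed together.

```python
import json
from typing import Dict, List


def build_update_domain_input(domain_id: str, images: List[Dict[str, str]]) -> dict:
    """Build the update-domain request body for a list of custom images."""
    custom_images = [
        {
            "ImageName": image["ImageName"],
            "AppImageConfigName": image["AppImageConfigName"],
        }
        for image in images
    ]
    return {
        "DomainId": domain_id,
        "DefaultUserSettings": {
            "KernelGatewayAppSettings": {"CustomImages": custom_images}
        },
    }


# Hypothetical image and app image config names, placeholder domain ID.
domain_input = build_update_domain_input(
    "d-xxxxxxxxxxxx",
    [
        {"ImageName": "tf25", "AppImageConfigName": "tf25-config"},
        {"ImageName": "pyspark", "AppImageConfigName": "pyspark-config"},
    ],
)
print(json.dumps(domain_input, indent=2))
```

Written to update-domain-input.json, this output can be passed to aws sagemaker update-domain --cli-input-json file://update-domain-input.json from a single place, leaving each image pipeline to only build and version its own image.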
Step 3: Using the custom image in Studio
Now you can use the custom image kernel in Studio.
It appears in the Studio launcher, and you can run your notebooks with it.

We can now use TensorFlow 2.5.0 in SageMaker Studio!
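To confirm the kernel really comes from the custom image, you can run a quick check in a notebook cell. This is a small sketch assuming Python 3.8 or later in the image and that the library is installed under the distribution name tensorflow. The helper name is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# In the custom image from this post, this should report a 2.5.0 version.
print("tensorflow:", installed_version("tensorflow"))
```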

Conclusion
With custom images, you can run notebooks in Studio with your favourite libraries. In this post, we automated the image setup so ML platform administrators can create, version, and attach them automatically to a Studio domain.
To automate the setup further, you can also create a Service Catalog product out of the CloudFormation template used in this post. You can also experiment with this example custom image from Lijo, which allows Studio users to run PySpark locally in their notebooks.