
Managing Scripts on AI Platform with GCP Cloud Source Repository

A tutorial to share the steps to manage and share scripts via GCP Cloud Source Repository

Photo by Thiago Barletta taken from Unsplash

Introduction

Previously, I had multiple Jupyter notebooks running under Google Cloud AI Platform. Whenever I wanted to copy a script from Notebook-1 to Notebook-2, I had to start the instance for Notebook-1, download the script, and upload it to Notebook-2. This approach is inefficient and incurs extra cost, because an instance has to be started just to copy a file. Then my teammates suggested collaborating through Google Cloud Source Repository, which was a brilliant idea!

So what is Google Cloud Source Repository?

Google Cloud Source Repositories is a platform for managing and sharing code among team members, or for handing scripts over to your client at the end of a project. It gives the team a single place to share scripts and to manage and track changes. Users can easily push updated versions of their scripts from their notebook to the Cloud Source Repository.

Google Cloud Source Repository can be connected to many other GCP components, such as App Engine and Cloud Build. This article, however, covers only the steps required to create a repository, push scripts from your AI Platform Jupyter notebook to Cloud Source Repositories, and clone from Cloud Source Repositories back to an AI Platform Jupyter notebook.

Let’s begin our tutorial! (This tutorial assumes that a GCP project has already been set up.)

Creating a Cloud Source Repository:

Begin by navigating to Cloud Source Repositories from your GCP console drop-down menu.

Step 1: Open Cloud Source Repository

Select the option to "add a repository" and choose "Create new repository".

Step 2: Create a new repository

Provide a "Repository name" and specify the "Project" that the Cloud Source Repository should belong to.

Step 3: Specify "Repository Name" and "Project"
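
The console steps above also have a command-line counterpart. The sketch below assumes the Google Cloud SDK is installed and authenticated; the repository and project names are the ones used in this tutorial's later examples, and the DRY_RUN switch is a hypothetical convenience (not part of gcloud) so the command can be inspected before it is actually run.

```shell
#!/bin/sh
# Names taken from this tutorial's examples; replace them with your own.
REPO_NAME="Our-Scripts-Folder"
PROJECT_ID="sue-gcp-learning-env"

# DRY_RUN=1 (the default in this sketch) only prints the command.
# Set DRY_RUN=0 to actually create the repository.
DRY_RUN="${DRY_RUN:-1}"

create_cmd="gcloud source repos create $REPO_NAME --project=$PROJECT_ID"

if [ "$DRY_RUN" = "1" ]; then
  echo "$create_cmd"
else
  $create_cmd
fi
```

Running the script with DRY_RUN=0 should leave you at the same point as the console steps: an empty repository ready to be cloned.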

Once your repository has been successfully created, a guide will share how you can add your code to the repository. There are 3 methods:

  • SSH Authentication
  • Google Cloud SDK
  • Manually generated credentials

Let’s follow the "Google Cloud SDK" method, which provides the commands required to push our scripts to the repository.

Step 4: Select & Follow method - Google Cloud SDK

Clone Repository to Jupyter Notebook

Let’s now open our Jupyter notebook and see how we can clone our newly created repository into it. To run the commands provided, we will need to launch the "Terminal".

Step 5: Launch "Terminal" inside the Jupyter notebook

This will launch a new "Terminal Tab". We can now begin cloning our newly created Cloud Source Repository in this "Terminal Tab" by running the set of commands provided.

Begin by running the first command to set up authentication credentials:

gcloud init 
Step 6: Provide Authentication Credentials

Next, clone your newly created Cloud Source Repository:

gcloud source repos clone Our-Scripts-Folder --project=sue-gcp-learning-env
Step 7: Clone Repository

Note the warning message that is displayed because we are cloning an empty repository. Let’s switch to the newly cloned repository and add files to it.

cd Our-Scripts-Folder
Step 8: Switch to the newly cloned repository

Upload some files into the newly cloned repository folder. In this example, I uploaded 2 files into the folder ‘Our-Scripts-Folder’.

Step 9: Upload files/scripts to the new folder

Pushing to Cloud Source Repository

Once the upload has finished, we need to add and commit the files before pushing them back to the main Cloud Source Repository.

In addition, you will need to set up your user details before committing by running the following commands:

git config --global user.email "you@example.com"
git config --global user.name "Your Name"
Step 10: Update user details
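
The --global flag writes these details to the user-wide Git configuration, so they apply to every repository that user works with on the instance. On a notebook instance shared with teammates, setting them per repository may be preferable. The following is a minimal sketch of that alternative, run in a throwaway repository; the directory, name, and email address are placeholders, not values from this tutorial.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the sketch does not touch any real project.
demo_repo="$(mktemp -d)"
git init --quiet "$demo_repo"
cd "$demo_repo"

# Without --global, the values are stored in this repository's .git/config only.
git config user.email "you@example.com"
git config user.name "Your Name"

# Read the values back to confirm what is stored for this repository.
local_email="$(git config --local --get user.email)"
local_name="$(git config --local --get user.name)"
echo "$local_name <$local_email>"
```

In the tutorial's setting, the two git config commands would be run inside Our-Scripts-Folder instead of a temporary directory.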

Now, you can add, commit and push your code to Cloud Source Repository.

git add . 
git commit -m "type your commit message here"
git push -u origin master
Step 11: add, commit and push
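
To rehearse the add, commit, and push sequence without touching the real repository, a local bare Git repository can stand in for Cloud Source Repositories. This is only a sketch under that assumption: the temporary paths, the file name script_one.py, and the identity values are hypothetical, and the branch name is read from Git rather than assumed to be master.

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"

# A bare repository plays the role of the remote (hypothetical stand-in).
git init --quiet --bare "$work/remote.git"

# Cloning an empty repository prints a warning; it is hidden here.
git clone --quiet "$work/remote.git" "$work/Our-Scripts-Folder" 2>/dev/null
cd "$work/Our-Scripts-Folder"

# Placeholder identity, stored for this repository only.
git config user.email "you@example.com"
git config user.name "Your Name"

# Stand-in for a script uploaded through the notebook interface.
echo 'print("hello")' > script_one.py

git add .
git commit --quiet -m "Add first script"

# Use whatever the current branch is called (master, main, ...).
branch="$(git symbolic-ref --short HEAD)"
git push --quiet -u origin "$branch"

# List the files that now exist on the remote branch.
pushed_files="$(git ls-tree --name-only "origin/$branch")"
echo "$pushed_files"
```

The final git ls-tree call is a terminal-side counterpart to checking the repository in the console: it lists the top-level files recorded on the pushed branch.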

After successfully pushing the files, check that they have been uploaded to the Cloud Source Repository.

Step 12: Perform checking

Congratulations! The files have been successfully uploaded to Cloud Source Repository.

Now, let’s work the other way round. Assume that we are sharing this set of scripts with a new colleague, who will need to clone the scripts from the Cloud Source Repository into their own Jupyter notebook. Let’s see how this can be done!

Clone and Pull from Cloud Source Repository

In Cloud Source Repository, the "Clone" option shows how to clone the repository and pull the code. Select the "+Clone" option and choose "How to setup?".

Step 13: Select the "+ Clone" option

Similar to before, there are 3 methods to clone. Let’s select the Google Cloud SDK option, which provides the commands required for our new colleague to clone the repository.

Clone the repository by running the clone command in the Terminal:

gcloud source repos clone Our-Scripts-Folder --project=sue-gcp-learning-env
Step 14: Clone Repository

That is the only step required: the repository has been cloned, with the files and scripts in the folder.

If there are new changes in the Cloud Source Repository, the following commands need to be executed to pull the latest version of the files/scripts.

Switch to the cloned repository location, then pull the updated files/scripts.

cd Our-Scripts-Folder
git pull origin master
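
Before pulling, it can be useful to see how many commits are waiting on the remote. The sketch below illustrates this with git fetch and git rev-list, using a local bare repository as a hypothetical stand-in for Cloud Source Repositories and two clones for the author and the colleague. All paths, file names, and identities are placeholders.

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"   # stand-in for the remote

# Author: clone the empty remote, then commit and push a first version.
git clone --quiet "$work/remote.git" "$work/author" 2>/dev/null
cd "$work/author"
git config user.email "author@example.com"
git config user.name "Author"
echo 'print("v1")' > script_one.py
git add .
git commit --quiet -m "Add script"
branch="$(git symbolic-ref --short HEAD)"
git push --quiet -u origin "$branch"

# Colleague: clone the repository while it contains only the first version.
git clone --quiet "$work/remote.git" "$work/colleague"

# Author: push an update afterwards.
echo 'print("v2")' > script_one.py
git commit --quiet -am "Update script"
git push --quiet origin "$branch"

# Colleague: fetch, count the incoming commits, then pull them.
cd "$work/colleague"
git fetch --quiet origin
incoming="$(git rev-list --count "HEAD..origin/$branch")"
echo "Commits waiting to be pulled: $incoming"
git pull --quiet --ff-only origin "$branch"
```

The --ff-only flag is a cautious choice for this sketch: the pull succeeds only if the local branch can simply be moved forward, so it never creates a merge commit by surprise.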

These are the steps you need to manage your code in Jupyter notebooks with Cloud Source Repository.


Conclusion:

Using Google Cloud Source Repository to manage team members’ scripts across different AI Platform Notebooks is more efficient, and changes can be tracked. Documentation and handover to another team member also become smoother.

Thanks for reading this article. I hope it helps anyone out there getting started with GCP Cloud Source Repository.

References & Links:

[1] https://cloud.google.com/source-repositories/docs/

[2] https://blog.peterschen.de/use-cloud-source-repositories-in-jupyter-notebooks/

