
Seamless CI/CD Pipelines with GitHub Actions on GCP: Your Tools for Effective MLOps

Lesson 7: Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using GitHub Actions.

THE FULL STACK 7-STEPS MLOPS FRAMEWORK

Photo by Hassan Pasha on Unsplash

This tutorial represents lesson 7 out of a 7-lesson course that will walk you step-by-step through how to design, implement, and deploy an ML system using MLOps good practices. During the course, you will build a production-ready model to forecast energy consumption levels for the next 24 hours across multiple consumer types from Denmark.

By the end of this course, you will understand all the fundamentals of designing, coding and deploying an ML system using a batch-serving architecture.

This course targets mid/advanced machine learning engineers who want to level up their skills by building their own end-to-end projects.

Nowadays, certificates are everywhere. Building advanced end-to-end projects that you can later show off is the best way to get recognition as a professional engineer.


Table of Contents:

  • Course Introduction
  • Course Lessons
  • Data Source
  • Lesson 7: Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using GitHub Actions.
  • Lesson 7: Code
  • Conclusion
  • References

Course Introduction

At the end of this 7-lesson course, you will know how to:

  • design a batch-serving architecture
  • use Hopsworks as a feature store
  • design a feature engineering pipeline that reads data from an API
  • build a training pipeline with hyperparameter tuning
  • use W&B as an ML Platform to track your experiments, models, and metadata
  • implement a batch prediction pipeline
  • use Poetry to build your own Python packages
  • deploy your own private PyPI server
  • orchestrate everything with Airflow
  • use the predictions to code a web app using FastAPI and Streamlit
  • use Docker to containerize your code
  • use Great Expectations to ensure data validation and integrity
  • monitor the performance of the predictions over time
  • deploy everything to GCP
  • build a CI/CD pipeline using GitHub Actions

If that sounds like a lot, don’t worry. After you cover this course, you will understand everything I said before. Most importantly, you will know WHY I used all these tools and how they work together as a system.

If you want to get the most out of this course, I suggest you access the GitHub repository containing all the lessons’ code. The course is designed for you to quickly read and replicate the code along with the articles.

By the end of the course, you will know how to implement the diagram below. Don’t worry if something doesn’t make sense to you. I will explain everything in detail.

Diagram of the architecture you will build during the course [Image by the Author].

By the end of Lesson 7, you will know how to manually deploy the 3 ML pipelines and the web app to GCP. Also, you will build a CI/CD pipeline that will automate the deployment process using GitHub Actions.


Course Lessons:

  1. Batch Serving. Feature Stores. Feature Engineering Pipelines.
  2. Training Pipelines. ML Platforms. Hyperparameter Tuning.
  3. Batch Prediction Pipeline. Package Python Modules with Poetry.
  4. Private PyPi Server. Orchestrate Everything with Airflow.
  5. Data Validation for Quality and Integrity using GE. Model Performance Continuous Monitoring.
  6. Consume and Visualize your Model’s Predictions using FastAPI and Streamlit. Dockerize Everything.
  7. Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using GitHub Actions.
  8. [Bonus] Behind the Scenes of an ‘Imperfect’ ML Project – Lessons and Insights

As Lesson 7 focuses on teaching you how to deploy all the components to GCP and build a CI/CD pipeline around it, for the full experience, we recommend you watch the other lessons of the course.

Check out Lesson 4 to learn how to orchestrate the 3 ML pipelines using Airflow and Lesson 6 to see how to consume the model’s predictions using FastAPI and Streamlit.


Data Source

We used a free & open API that provides hourly energy consumption values for all the energy consumer types within Denmark [1].

They provide an intuitive interface where you can easily query and visualize the data. You can access the data here [1].

The data has 4 main attributes:

  • Hour UTC: the UTC datetime when the data point was observed.
  • Price Area: Denmark is divided into two price areas: DK1 and DK2 – divided by the Great Belt. DK1 is west of the Great Belt, and DK2 is east of the Great Belt.
  • Consumer Type: The consumer type is the Industry Code DE35, owned and maintained by Danish Energy.
  • Total Consumption: Total electricity consumption in kWh

Note: The observations have a lag of 15 days! But for our demo use case, that is not a problem, as we can simulate the same steps as we would in real time.

A screenshot from our web app showing how we forecasted the energy consumption for area = 1 and consumer_type = 212 [Image by the Author].

The data points have an hourly resolution. For example: "2023-04-15 21:00Z", "2023-04-15 20:00Z", "2023-04-15 19:00Z", etc.

We will model the data as multiple time series. Each unique (price area, consumer type) tuple represents its own time series.

Thus, we will build a model that independently forecasts the energy consumption for the next 24 hours for every time series.

Check out the video below to better understand what the data looks like 👇


Lesson 7: Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using GitHub Actions.

The goal of Lesson 7

Within Lesson 7, I will teach you 2 things:

  1. How to manually deploy the 3 ML pipelines and the web app to GCP.
  2. How to automate the deployment process with a CI/CD pipeline using GitHub Actions.

Diagram of the final architecture with the Lesson 7 components highlighted in blue [Image by the Author].

In other words, you will take everything you have done so far and show it to the world.

As long as your work sits on your computer, it can be the best ML solution in the world, but unfortunately, it won’t add any value.

Knowing how to deploy your code is critical to any project.

So remember…

We will use GCP as the cloud provider and GitHub Actions as the CI/CD tool.


Theoretical Concepts & Tools

CI/CD: CI/CD stands for continuous integration and continuous delivery.

The CI step mainly consists of building and testing your code every time you push code to git.

The CD step automatically deploys your code to multiple environments: dev, staging, and production.

Depending on your specific software requirements, you may not need all the stages of a standard CI/CD pipeline.

For example, if you are working on a proof of concept, a staging environment might be overkill. But having dev and production CD pipelines will drastically improve your productivity.

GitHub Actions: GitHub Actions is one of the most popular CI/CD tools out there in the wild. It is directly integrated into your GitHub repository. The sweet part is that you don’t need any VMs to run your CI/CD pipeline. Everything is running on GitHub’s computers.

You need to specify a set of rules within a YAML file, and GitHub Actions will take care of the rest. I will show you how it works in this article.

GitHub Actions is entirely free for public repositories. How awesome is that?

As a side note: using GitHub Actions, you can trigger any job based on various repository events, but using it as a CI/CD tool is the most common use case.


Lesson 7: Code

You can access the GitHub repository here.

Note: All the installation instructions are in the READMEs of the repository. Here, we will jump straight to the code.

The code and instructions for Lesson 7 are under the following:

  • deploy/ – Docker and shell deploying files
  • .github/workflows – GitHub Actions CI/CD workflows
  • README_DEPLOY – README dedicated to deploying the code to GCP
  • README_CICD – README dedicated to setting up the CI/CD pipeline

Prepare Credentials

Directly storing credentials in your git repository is a huge security risk. That is why you will inject sensitive information using a .env file.

The .env.default file is an example of all the variables you must configure. It is also helpful for storing default values for attributes that are not sensitive (e.g., the project name).

A screenshot of the .env.default file [Image by the Author].
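
To give you an idea of the shape of such a file, here is a purely hypothetical sketch – the variable names below are illustrative, and the real ones are defined in the repository’s .env.default:

# Hypothetical .env example – check the repository's .env.default for the actual variable names.
FS_API_KEY=                               # sensitive: your feature store (Hopsworks) API key
WANDB_API_KEY=                            # sensitive: your W&B API key
GOOGLE_CLOUD_PROJECT=energy_consumption   # non-sensitive default that is safe to keep in git

The sensitive values stay empty in .env.default and are filled in only inside your local .env file, which is never committed.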

To replicate this article, you must set up all the infrastructure and services used during the course.

There are 2 main components that can be deployed separately.

#1. The 3 ML pipelines:

  • Feature Pipeline
  • Training Pipeline
  • Batch Prediction Pipeline

For #1., you have to set up the external services used throughout the course (e.g., the Hopsworks feature store, the W&B ML platform, and the GCS buckets used as storage).

#2. Web App:

Fortunately, for #2., you have to set up only the GCS buckets used as storage.

But note that if you do only section #2., you won’t have any data to consume within your web app.

We don’t want to overflow this article with boring stuff, such as setting up credentials. Fortunately, if you want to implement and replicate the entire course, you have step-by-step instructions in the previous articles and the GitHub README.

If you want to see (and not replicate) how we deployed our code to GCP and built the GitHub Actions workflows, you don’t have to bother with any of the credentials. Just proceed to the following sections ✌️

NOTE: This lesson is the only one in the course that relies on a service without a freemium plan. When I wrote this course, deploying and testing the infrastructure on GCP cost me ~$20. But I had a brand-new GCP account that offered me $300 in GCP credits, thus indirectly making it free of charge. Just remember to delete all the GCP resources when you are done, and you will be OK.


Manually Deploy to GCP

So, let’s manually deploy the 2 main components to GCP:

  • ML Pipeline
  • Web App

But as a first step, let’s set up all the GCP resources we need for the deployment. Afterward, you will SSH into your machines and deploy your code.

For more info, access the GitHub deployment README.


Set Up Resources

Let’s go to your GCP energy_consumption project and create the following resources:

  1. Admin VM Service Account with IAP Access
  2. Expose Ports Firewall Rule
  3. IAP for TCP Tunneling Firewall Rule
  4. VM for the Pipeline
  5. VM for the Web App
  6. External Static IP

Don’t get discouraged by the fancy names. You will have access to step-by-step guides using this article + the GCP documentation I will provide.

Note: If you don’t plan to replicate the infrastructure on your GCP infrastructure, skip the "Set Up Resources" section and go directly to "Deploy the ML Pipeline".


#1. Admin VM Service Account with IAP Access

We need a new GCP service account with admin rights & IAP access to the GCP VMs.

You have to create a new service account and assign to it the following roles:

  • Compute Instance Admin (v1)
  • IAP-secured Tunnel User
  • Service Account Token Creator
  • Service Account User

IAP stands for Identity-Aware Proxy. It is a way to create tunnels that route TCP traffic inside your private network. You can read more about this topic in the following docs [2], [3] (you don’t have to understand it in depth to proceed to the next steps).


#2. Expose Ports Firewall Rule

Create a firewall rule that exposes the following TCP ports: 8501, 8502, and 8001.

Also, add a target tag called energy-forecasting-expose-ports.
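
If you prefer the CLI over the console UI, a rule like this can also be created with gcloud. Below is a minimal sketch, assuming the default network and the rule name energy-forecasting-expose-ports (adjust it to your setup):

# Sketch: create a firewall rule that opens the web app & API ports on tagged VMs.
gcloud compute firewall-rules create energy-forecasting-expose-ports \
    --network=default \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:8501,tcp:8502,tcp:8001 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=energy-forecasting-expose-ports

Note that 0.0.0.0/0 opens these ports to the whole internet, which is acceptable for this demo but something you would restrict in a real production setup.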

Here are 2 docs that helped us create and configure the ports for the firewall rule: [4], [5].

Here is what our firewall rule looks like 👇

Screenshot of the GCP "expose ports" firewall rule [Image by the Author].
Screenshot of the GCP "expose ports" firewall rule [Image by the Author].

#3. IAP for TCP Tunneling Firewall Rule

Now we will create a firewall rule allowing IAP for TCP Tunneling on all the VMs connected to the default network.

Step-by-step guide on how to create the IAP for TCP tunneling Firewall rule [6].
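
As a hedged CLI alternative to the console steps from the guide above, a rule along these lines allows SSH traffic from Google’s published IAP address range (the rule name is illustrative):

# Sketch: allow IAP to reach port 22 on VMs in the default network.
gcloud compute firewall-rules create allow-iap-tcp-tunneling \
    --network=default \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:22 \
    --source-ranges=35.235.240.0/20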

Here is what our firewall rule looks like 👇

Screenshot of the GCP "IAP TCP forwarding" firewall rule [Image by the Author].
Screenshot of the GCP "IAP TCP forwarding" firewall rule [Image by the Author].

#4. VM for the Pipeline

Go to your GCP energy_consumption project -> VM Instances -> Create Instance.

Choose e2-standard-2: 2 vCPU cores – 8 GB RAM as your VM instance type.

Call it: ml-pipeline

Change the disk to 20 GB Storage.

Pick region europe-west3 (Frankfurt) and zone europe-west3-c. You can pick any other region & zone, but if it is your first time doing this, we suggest you do it like us.

Network: default

Also, check the HTTP and HTTPS boxes and add the energy-forecasting-expose-ports custom firewall rule we did a few steps back.

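If you would rather script the VM creation than click through the console, here is a rough gcloud equivalent of the settings above – a sketch, not the course’s official command, so double-check the machine type, disk size, and tags against your setup:

# Sketch: create the ml-pipeline VM with the settings described above.
gcloud compute instances create ml-pipeline \
    --project=<your-project-id> \
    --zone=europe-west3-c \
    --machine-type=e2-standard-2 \
    --network=default \
    --boot-disk-size=20GB \
    --tags=http-server,https-server,energy-forecasting-expose-ports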


#5. VM for the Web App

Now let’s repeat a similar process for the Web App VM, but with slightly different settings.

This time choose e2-micro: 0.25-2 vCPU – 1 GB memory as your VM instance type.

Call it: app

Change the disk to a 15 GB standard persistent disk.

Pick region europe-west3 (Frankfurt) and zone europe-west3-c.

Network: default

Also, check the HTTP and HTTPS boxes and add the energy-forecasting-expose-ports custom firewall rule we created a few steps back.
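
The equivalent gcloud sketch for this VM only changes the instance name, machine type, and disk size (again, an illustrative command rather than the official one):

# Sketch: create the smaller "app" VM for the web app.
gcloud compute instances create app \
    --project=<your-project-id> \
    --zone=europe-west3-c \
    --machine-type=e2-micro \
    --network=default \
    --boot-disk-size=15GB \
    --tags=http-server,https-server,energy-forecasting-expose-ports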


#6. External Static IP

This is the last piece of the puzzle.

If we want the external IP for our web app to be static (aka not to change), we have to attach a static address to our web app VM.

We suggest attaching it only to the app VM we created a few steps back.

That said, adding a static external IP to the ml-pipeline VM is also perfectly fine.

Docs on reserving a static external IP address [7].
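
For reference, reserving such an address from the CLI looks roughly like this (the address name is illustrative):

# Sketch: reserve a static address in the same region as the VM and print it.
gcloud compute addresses create energy-forecasting-app-ip --region=europe-west3
gcloud compute addresses describe energy-forecasting-app-ip --region=europe-west3 --format="get(address)"

After reserving it, attach the address to the app VM by editing the VM’s network interface and selecting the reserved IP as its external address.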

Now that the boring part is finished, let’s start deploying the code 👇


Deploy the ML Pipeline

As a first step, we must install the gcloud GCP CLI tool so our computer can talk to the GCP VMs.

To authenticate, we will use the service account configured with admin rights for VMs and IAP access to SSH.

Now, we must tell the gcloud GCP CLI to use that service account.

To do so, you must create a key for your service account and download it as a JSON file. Same as you did for the buckets service accounts – here are some docs to refresh your mind [8].

After you download the file, you have to run the following gcloud command in your terminal:

gcloud auth activate-service-account <your-service-account-email> --key-file=/path/key.json --project=<your-project-id>

Check out this doc for more details about the gcloud auth command.

Now whenever you run commands with gcloud, it will use this service account to authenticate.


Now let’s connect through SSH to the ml-pipeline GCP VM you created a few steps ahead:

gcloud compute ssh ml-pipeline --zone europe-west3-c --quiet --tunnel-through-iap --project <your-project-id>
  • NOTE 1: Change the zone if you haven’t created a VM within the same zone as us.
  • NOTE 2: Your project-id is NOT your project name. Go to your GCP projects list and find the project id.

From this point onward, if you configured the firewalls and the service account correctly, all the steps will be 99% similar to those from the rest of the articles, as everything is Dockerized.

Check out the GitHub README – the Set Up Additional Tools and Usage sections – for step-by-step instructions.

You can follow the same steps while you are connected with SSH to the ml-pipeline GCP machine.

Note that the GCP machine is using Linux as its OS. Thus, you can directly copy & paste the commands from the README regardless of the OS you use on your local device.

Screenshot of connecting to the "app" VM using gcloud [Screenshot by the Author].

You can safely repeat all the steps you did when setting up the pipeline locally using this SSH connection, but keep in mind the following 3 edge cases:

#1. Clone the code in the home directory of the VM

Just SSH into the VM and run:

git clone https://github.com/iusztinpaul/energy-forecasting.git
cd energy-forecasting

#2. Install Docker using the following commands:

Install Docker:

sudo apt update
sudo apt install --yes apt-transport-https ca-certificates curl gnupg2 software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
sudo apt update
sudo apt install --yes docker-ce

Add your user to the docker group so you can run Docker without sudo:

sudo usermod -aG docker $USER
logout 

Log in to your machine again:

gcloud compute ssh ml-pipeline --zone europe-west3-c --quiet --tunnel-through-iap --project <your-project-id>

Check out these docs for the full instructions [9].

#3. Replace all cp commands with gcloud compute scp:

This command will help you to copy files from your local machine to the VM.

For example, instead of running:

cp -r /path/to/admin/gcs/credentials/admin-buckets.json credentials/gcp/energy_consumption

Run in a different terminal (not the one connected with SSH to your VM):

gcloud compute scp --recurse --zone europe-west3-c --quiet --tunnel-through-iap --project <your-project-id> /local/path/to/admin-buckets.json ml-pipeline:~/energy-forecasting/airflow/dags/credentials/gcp/energy_consumption/

This command will copy your local admin-buckets.json file to the ml-pipeline VM.


After setting up your code on the ml-pipeline GCP VM, go to your VM view from GCP and the Network tags section. There you will find the External IP address column, as shown in the image below. Copy that IP and attach port 8080 to it.

For example, based on the External IP address from the image below, I accessed Airflow using this address: 35.207.134.188:8080.

Congrats! You connected to your own self-hosted Airflow application.

Note: If it doesn’t connect, give it a few seconds to load properly.

Screenshot of the "app" GCP VM configurations [Image by the Author].
Screenshot of the "app" GCP VM configurations [Image by the Author].

Deploy the Web App

Let’s connect using SSH to the "app" GCP VM you created earlier:

gcloud compute ssh app --zone europe-west3-c --quiet --tunnel-through-iap --project <your-project-id>
  • NOTE 1: Change the zone if you haven’t created a VM within the same zone as us.
  • NOTE 2: Your project-id is NOT your project name. Go to your GCP projects list and find the project id.

Here the process is similar to the one described in the "Deploy the ML Pipeline" section.

You can deploy the web app following the steps described in Lesson 6 or in the GitHub repository’s Set Up Additional Tools & Usage sections.

But keep in mind the 3 edge cases described in the "Deploy the ML Pipeline" section.

Please excuse me for referring you to so much external documentation on how to set this stuff up. The article is already long, and I didn’t want to replicate the official GCP documentation here.


CI/CD Pipeline Using GitHub Actions (free)

The GitHub Actions YAML files are under the .github/workflows directory.

First, let me explain the main components of a GitHub Actions file 👇

Using the "on -> push -> branches:" section, you specify which branch to listen to for events. In this case, the GitHub Action is triggered when new code is committed to the "main" branch.

In the "env: "section, you can declare the environment variables you need inside the script.

In the "jobs -> ci_cd -> steps:" section, you will declare the CI/CD pipeline steps, which will run sequentially.

In the "jobs -> ci_cd -> runs-on:" section, you specify the image of the VM you want the steps to run on.

Now, let’s take a look at some actual GitHub Action files 🔥

ML Pipeline GitHub Actions YAML file

The action will be triggered when new code is committed to the "main" branch, except for the web app directories and the YAML and Markdown files.

We added environment variables that contain information about the GCP project and VM.

As for the CI/CD steps, we mainly do 2 things:

  1. configure the credentials & authenticate to GCP,
  2. connect over SSH to the given GCP VM and run a command that: goes to the code directory, pulls the latest changes, builds the Python packages, and deploys them to the private PyPI registry. Now Airflow will use the new Python packages the next time it runs.

Basically, it does what you would have done manually, but now, everything is nicely automated using GitHub Actions.

Note that you don’t have to remember or know how to write a GitHub Actions file from scratch, as you can find already-written templates for most use cases. For example, here is the google-github-actions/ssh-compute [11] repository we used as a starting point for our YAML files.

You will find similar templates for almost any use case you have in mind.

Web App GitHub Actions YAML file

The Web App actions file is 90% the same as the one used for the ML pipeline, except for the following:

  • we ignore the ML pipeline files;
  • we run a docker command that builds and runs the web app.

But where does the weird "${{ vars… }}" syntax come from? I will explain it in just a sec (see the snippet after this list), but here is what you have to know for now:

  • "${{ vars. }}":variables set inside GitHub;
  • "${{ secrets. }}": secrets set inside GitHub. Once a secret is set, you can’t see it anymore (the variables you can);
  • "${{ env. }}": environment variables set in the "env:" section.

Important Observation

The YAML file above doesn’t contain the CI part, only the CD one.

To follow good practices for a robust CI pipeline, you should run an action that builds the Docker images and pushes them to a Docker registry.

Afterward, you would SSH to a testing environment and run your test suite. As a final step, you would SSH to the production VM, pull the images, and run them.

The series got too long, and we wanted to keep it simple, but the good news is that you learned all the necessary tools and principles to do what we described above.


Set Secrets and Variables

At this point, you have to fork the energy-forecasting repository to configure the GitHub Actions credentials with your own.

Check out this doc to see how to fork a repository on GitHub [10].

Set Actions Variables

Go to your forked repository, then click on: "Settings -> Secrets and variables -> Actions."

Now, click "Variables." You can create a new variable by clicking "New repository variable." See the image below 👇

Screenshot of how to create a new repository variable [Image by the Author].

You have to create 5 variables that the GitHub Actions scripts will use (a CLI alternative for setting them follows the list):

  • APP_INSTANCE_NAME: the name of the web app VM. In our case, it is called "app". The default should be OK if you use our recommended naming conventions.
  • GCLOUD_PROJECT: the ID of your GCP Project. Here, you have to change it with your project ID.
  • ML_PIPELINE_INSTANCE_NAME: the name of the ML pipeline VM. In our case, it is "ml-pipeline". The default should be OK if you use our recommended naming conventions.
  • USER: the user you used to connect to the VMs while setting up the machines over SSH. Mine was "pauliusztin", but you must change it to yours. Go to the VM and run echo $USER.
  • ZONE: the zone where you deployed the VMs. The default should be OK if you use our recommended naming conventions.
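
If you prefer the terminal, the GitHub CLI can set the same repository variables – a sketch, assuming a recent gh version that supports repository variables, authenticated against your fork and run from inside the repository:

# Sketch: set the 5 repository variables with the GitHub CLI.
gh variable set APP_INSTANCE_NAME --body "app"
gh variable set GCLOUD_PROJECT --body "<your-project-id>"
gh variable set ML_PIPELINE_INSTANCE_NAME --body "ml-pipeline"
gh variable set USER --body "<your-vm-user>"
gh variable set ZONE --body "europe-west3-c"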

Set Action Secrets

In the same "Secrets and variables/Actions" section, hit the "Secrets" tab.

You can create a new secret by pressing the "New repository secret" button.

These are similar to the variables we just created, but after you fill in their values, you can’t see them anymore. That is why they are called secrets.

Here is where you add all your sensitive information. In our case, the GCP credentials and private keys. See the image below 👇

Screenshot of how to create a new repository secret [Image by the Author].

The GCP_CREDENTIALS secret contains the content of the JSON key of your VM admin service account. By setting this up, the CI/CD pipeline will use that service account to authenticate to the VMs.

Because the content of the file is in JSON format, you have to take the following steps to format it properly:

Install the jq CLI tool:

sudo apt update
sudo apt install -y jq
jq --version

Format your JSON key file:

jq -c . /path/to/your/admin-vm.json

Take the output of this command and create your GCP_CREDENTIALS secret with it.
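
Alternatively, you can pipe the jq output straight into the GitHub CLI instead of pasting it into the UI – again, a sketch assuming gh is installed and authenticated against your fork:

# Sketch: create the GCP_CREDENTIALS secret directly from the compacted JSON key.
gh secret set GCP_CREDENTIALS --body "$(jq -c . /path/to/your/admin-vm.json)"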

The GCP_SSH_PRIVATE_KEY is your GCP private SSH key (not your personal one – GCP creates an additional one automatically), which was created on your local computer when you used SSH to connect to the GCP VMs.

To copy it, run the following:

cd ~/.ssh
cat google_compute_engine

Copy the output from the terminal and create the GCP_SSH_PRIVATE_KEY secret.

Run the CI/CD Pipeline

Now make any change to the code, push it to the main branch, and the GitHub Actions files should trigger automatically.

Check your GitHub repository’s "Actions" tab to see their results.

Screenshot of the GitHub Actions running logs on GitHub [Image by the Author].

Two actions will be triggered. One will build and deploy the ml-pipeline modules to your ml-pipeline GCP VM, and one will build and deploy the web app to your app GCP VM.


Conclusion

Congratulations! You finished the last lesson from the Full Stack 7-Steps MLOps Framework course. It means that now you are a full-stack ML engineer 🔥

I apologize again for the highly technical article. It isn’t a very entertaining read but a crucial step for finalizing this series.

In lesson 7, you learned how to:

  • manually deploy the 3 ML pipelines to GCP;
  • manually deploy the web app to GCP;
  • build a CI/CD pipeline to automate the deployment process using GitHub Actions.

Now that you understand how to add real business value by deploying your ML system and putting it to work, it is time to build your awesome ML project.

No project is perfectly built, and this one is no exception.

Thus, check out our bonus lesson of The Full Stack 7-Steps MLOps Framework course, where we will openly discuss other design choices we could have taken to improve the ML system built during this course.

I sincerely appreciate that you chose my course to learn MLE & MLOps✌️

Let’s connect on LinkedIn, and let me know if you have any questions or just share the awesome projects you built after this course.

Access the GitHub repository here.


💡 My goal is to help Machine Learning engineers level up in designing and productionizing ML systems. Follow me on LinkedIn or subscribe to my weekly newsletter for more insights!

🔥 If you enjoy reading articles like this and wish to support my writing, consider becoming a Medium member. Using my referral link, you can support me without extra cost while enjoying limitless access to Medium’s rich collection of stories.

Join Medium with my referral link – Paul Iusztin

Thank you ✌🏼 !


References

[1] Energy Consumption per DE35 Industry Code from Denmark API, Denmark Energy Data Service

[2] Using IAP for TCP forwarding, GCP Docs

[3] Overview of TCP forwarding, GCP Docs

[4] Google Cloud Collective, How to open a specific port such as 9090 in Google Compute Engine (2017), Stack Overflow

[5] Anthony Heddings, How to Open Firewall Ports on a GCP Compute Engine Instance (2020), How-To Geek

[6] Preparing your project for IAP TCP forwarding, GCP Docs

[7] Reserve a static external IP address, GCP Docs

[8] Create and delete service account keys, GCP Docs

[9] Tom Roth, Install Docker on a Google Cloud virtual machine (2018), Tom Roth Blog

[10] Fork a repo, GitHub Docs

[11] GCP GitHub Actions Repository, GitHub

