
In a data scientist's daily routine, processing power and compute are an ongoing concern as datasets grow and model architectures become more complex. You cannot always count on having the budget to upgrade to the latest hardware with better memory, CPUs, and GPUs, so it is almost inevitable that data scientists will look for other solutions. One solution, often far cheaper than upgrading hardware, is a cloud virtual machine (VM).
Virtual machines are cloud services that let you configure computing capacity as needed. Simply put, a VM is a cloud computer that you can upgrade when you need more computing power and memory for heavy-duty tasks. However, VMs are usually not as easy to use as our native operating systems, and there is a bit of a learning curve if you want to find your way around Linux. This tutorial therefore shows how to connect your VM to a local development environment such as VSCode, and how to set up GitHub on the VM for version control. Without further ado, let’s dig in!
Step 1: Setting up an Instance on Cloud (GCP)
In this tutorial, the cloud service I will be using is Google Cloud Platform (GCP), whose free starting credit I appreciate.
- Create a GCP account and log in to the console.
- Enable the Compute Engine API; it should be enabled by default.
- In the console, navigate to Compute Engine and click Create under VM instances.

- Select the instance name, region, and machine configuration. You can also check the estimated cost of your selection in the sidebar. I am choosing the cheapest option since I only want to demo how it works.

- Set up a boot disk for the instance. I am using Ubuntu 16.04 LTS, but you can choose from a variety of available public images. For the firewall, enable HTTP/HTTPS traffic.
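The console steps above can also be sketched with the gcloud CLI once it is installed and authenticated (see Step 2). This is only a rough equivalent under stated assumptions: the instance name, zone, machine type, and image family are hypothetical placeholders rather than the values used in this tutorial, and the http-server and https-server network tags only take effect if matching firewall rules exist in your project. The script prints the command for review instead of running it.

```shell
# All values below are hypothetical placeholders; substitute your own.
INSTANCE_NAME="demo-vm"
ZONE="us-central1-a"
MACHINE_TYPE="e2-micro"
IMAGE_FAMILY="ubuntu-1604-lts"   # assumption: verify with `gcloud compute images list`
IMAGE_PROJECT="ubuntu-os-cloud"

# Build the command as a string so it can be reviewed before running.
CREATE_CMD="gcloud compute instances create $INSTANCE_NAME \
--zone=$ZONE \
--machine-type=$MACHINE_TYPE \
--image-family=$IMAGE_FAMILY \
--image-project=$IMAGE_PROJECT \
--tags=http-server,https-server"

# Print the command; nothing is created at this point.
echo "$CREATE_CMD"
# To actually create the instance, run: eval "$CREATE_CMD"
```

Printing first makes it easy to compare the settings with what you would have picked in the console.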

Step 2: Connecting VM from VSCode
There are two key parts to this step: creating an SSH key file and using the Remote-SSH extension in VSCode to connect to the VM. To do this, we need a third-party connection method, which is outlined in this guide. I will also walk through the process of setting it up with VSCode.
1. Install gcloud
To install gcloud, follow the official installation guide for your operating system.
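After installing, it can help to confirm that the CLI is reachable from your shell before moving on. The check below is a minimal sketch; the GCLOUD_STATUS variable is a hypothetical name used only for illustration.

```shell
# Report whether the gcloud CLI is reachable from this shell.
if command -v gcloud >/dev/null 2>&1; then
    GCLOUD_STATUS="installed"
    gcloud --version          # prints the installed SDK component versions
else
    GCLOUD_STATUS="missing"
    echo "gcloud not found on PATH; install the Google Cloud SDK first."
fi
echo "gcloud status: $GCLOUD_STATUS"
```

If the status is missing right after installation, opening a new terminal window usually picks up the updated PATH.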
2. Auth login with gcloud
Once gcloud is installed, open a new terminal window in VSCode and log in:
gcloud auth login
It will direct you to a sign-in page. Sign in and allow SDK access, and you should see a confirmation message in the terminal. If it reports that the project is missing, set the project ID with gcloud config set project <Project_ID>. The project ID can be found in your console.

3. Enable OS-login in gcloud
Enabling OS Login amounts to adding a metadata entry to either the project or the VM instance you are using. Check this page for more information. Since we are already logged in through the terminal, the following command sets it up quickly at the project level:
gcloud compute project-info add-metadata --metadata enable-oslogin=TRUE
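If you would rather enable OS Login for a single VM instead of the whole project, the instance-level variant might look like the sketch below. The instance name and zone are hypothetical placeholders, and the script only prints the command so you can review it first.

```shell
# Hypothetical placeholders; replace with your instance name and zone.
VM_NAME="demo-vm"
ZONE="us-central1-a"

# Same metadata key as the project-level command, applied to one instance.
OSLOGIN_CMD="gcloud compute instances add-metadata $VM_NAME --zone=$ZONE --metadata enable-oslogin=TRUE"

# Print the command; nothing is changed at this point.
echo "$OSLOGIN_CMD"
# To apply it, run: eval "$OSLOGIN_CMD"
```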
4. Create an SSH key pair in the terminal
There are many ways to generate a key pair. For Mac/Linux, use the following command in the terminal:
ssh-keygen -t rsa
You will see something like the following. A passphrase is optional.
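If you prefer to skip the interactive prompts, the key generation can be sketched non-interactively as below. The scratch directory and the demo_id_rsa file name are hypothetical, chosen so that existing keys in ~/.ssh are not overwritten; -b, -f, -N, and -q are standard ssh-keygen options.

```shell
# Generate a throwaway RSA key pair in a scratch directory (hypothetical path).
KEY_DIR="$(mktemp -d)"
KEY_PATH="$KEY_DIR/demo_id_rsa"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -b 4096: key size, -f: output file, -N "": empty passphrase, -q: quiet
    ssh-keygen -t rsa -b 4096 -f "$KEY_PATH" -N "" -q
    ls "$KEY_DIR"    # lists the private key and the matching .pub file
else
    echo "ssh-keygen not found; install an OpenSSH client first."
fi
```

For a key you intend to keep, point -f at a path under ~/.ssh and consider setting a passphrase instead of leaving it empty.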

5. Add the SSH key pair to a user account by gcloud
Now that we have generated the SSH key pair, we need to add the public key to the user account. Use the following gcloud command in your terminal:
gcloud compute os-login ssh-keys add \
    --key-file <key-file-path> \
    --ttl <expire-time>
<key-file-path> is your public key file, which ends with .pub.
<expire-time> is a number followed by one of the units below, e.g. 10d for an expiration time of 10 days.
- s for seconds
- m for minutes
- h for hours
- d for days
- Set the value to 0 to indicate no expiration time.
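To make the format concrete, here is a small sketch that checks a value against the rules listed above before you pass it to --ttl. The is_valid_ttl helper is hypothetical and encodes only what this tutorial describes; gcloud itself may accept other duration forms.

```shell
# Succeeds if the argument is a bare 0 (no expiration) or digits followed
# by one of the units s, m, h, d. Hypothetical helper for illustration only.
is_valid_ttl() {
    printf '%s\n' "$1" | grep -Eq '^(0|[0-9]+[smhd])$'
}

# Try a few sample values, including two malformed ones.
for ttl in 10d 30m 0 d 10x; do
    if is_valid_ttl "$ttl"; then
        echo "$ttl: ok"
    else
        echo "$ttl: rejected"
    fi
done
```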
The output of this command will look similar to the following. The key piece of information here is your username. Save it somewhere, since you will need it later to connect.

6. Connect to the host by using the key pair
Before connecting to your instance directly from VSCode, it is recommended to test the connection using the ssh command in the terminal.
ssh -i <PATH_TO_PRIVATE_KEY> <USERNAME>@<EXTERNAL_IP>
- <PATH_TO_PRIVATE_KEY>: We generated this file in the previous step; the default path is ~/.ssh/id_rsa.
- <USERNAME>: This is the username we obtained from the output of the previous command.
- <EXTERNAL_IP>: Locate this in the console.

Once the connection succeeds, the VM terminal will look like this:

At this point, we can proceed to connect with VSCode. Use exit to leave the remote machine.
7. Use Remote-SSH to Connect VM
Remote-SSH is a VSCode extension for connecting to a remote machine. You can search for it and install it from the Extensions tab.

Click the green icon in the bottom left of the window (or press F1 and search for Remote-ssh). Choose Connect to host.

Then choose Add New SSH Host.

Now paste the exact ssh command you used for the test connection, then press Enter.

VSCode will open a new window with the remote connection, and you now have VSCode connected to the GCP instance.
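Remote-SSH saves the host you add as an entry in an SSH config file. If you want to write or adjust that entry by hand, it might look like the sketch below. The gcp-vm alias is a hypothetical name, the angle-bracket placeholders are the same ones used in the ssh test command, and the entry is written to a scratch file for review rather than directly into ~/.ssh/config.

```shell
# Write a host entry to a scratch file; merge it into ~/.ssh/config by hand.
SSH_CONFIG_SNIPPET="$(mktemp)"
cat > "$SSH_CONFIG_SNIPPET" <<'EOF'
# "gcp-vm" is a hypothetical alias; pick any name you like.
Host gcp-vm
    HostName <EXTERNAL_IP>
    User <USERNAME>
    IdentityFile <PATH_TO_PRIVATE_KEY>
EOF

# Show the entry so it can be reviewed before copying.
cat "$SSH_CONFIG_SNIPPET"
```

With such an entry in ~/.ssh/config and the placeholders filled in, ssh gcp-vm in the terminal and the gcp-vm host in Remote-SSH both use the same settings.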

Step 3: Set up an SSH connection with GitHub
Now that VSCode is backed by the VM, we can start working on projects hosted on GitHub. To integrate the workflow seamlessly, we can set up an SSH connection to GitHub so that all git commands work from the VM. GitHub offers a comprehensive guide on how to set this up, but I will still walk through the process in this tutorial.
As with the VM, we need a key pair for GitHub. Open a new terminal inside VSCode and generate an SSH key pair with the following command, substituting the email address you registered with GitHub.
ssh-keygen -t rsa -C "your_email@example.com"
Now that we have the SSH key files, we need to add the private key to the ssh-agent. To start the agent in the background, use the following:
eval "$(ssh-agent -s)"
>> Agent pid 59566
Now add the private key to the agent:
ssh-add ~/.ssh/id_rsa
We also need to add the public key to GitHub. Log in to GitHub with your account and find SSH and GPG keys under Settings. Click New SSH key as shown in the snapshot.

In the terminal, print the public key and copy it:
cat ~/.ssh/id_rsa.pub
Now paste it into the Key field and save the new SSH key.

Test the GitHub connection in the terminal. Paste the following command:
ssh -T git@github.com
On the first connection you will see something like the following. Type yes to continue.
> The authenticity of host 'github.com (IP ADDRESS)' can't be established.
> RSA key fingerprint is SHA256:xxxxx.
> Are you sure you want to continue connecting (yes/no)?
Now you have an SSH connection to GitHub. Use git clone to clone your GitHub repo to the VM, then run code . inside the repo folder, and a new workspace will open for it.
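For the clone to go through the SSH key rather than HTTPS, the repository address needs the SSH form. The sketch below builds that address from an account name and a repository name; github_ssh_url is a hypothetical helper, and <GITHUB_USER> and <REPO_NAME> are placeholders for your own values.

```shell
# Build the SSH-style clone URL for a GitHub repository.
# Hypothetical helper: $1 is the account name, $2 is the repository name.
github_ssh_url() {
    printf 'git@github.com:%s/%s.git\n' "$1" "$2"
}

# Placeholders; replace with your own account and repository.
REPO_URL="$(github_ssh_url "<GITHUB_USER>" "<REPO_NAME>")"

# Print the clone command for review.
echo "git clone $REPO_URL"
```

After cloning, change into the repository folder and run code . to open it as a workspace.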

Now you have the same experience as if you were working on your local computer!

Useful Resources for Your Instance
Here are some useful resources to install on the Linux instance you have set up with VSCode.
Thanks for following along! Enjoy 🥳