Kubeflow on a GCP VM with minikube

A beginner’s step-by-step guide to getting Kubeflow running on a GCP VM with minikube

Santiago Velez Garcia
Towards Data Science


Kubeflow’s goal is to make deployments of machine learning (ML) workflows on Kubernetes simple, portable, and scalable. The spirit is to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow. For a glimpse at the official documentation, go here.

It all sounds great, but it turns out that Kubeflow, while great, is not that light, so considerable resources are needed to get it up and running. Many of us do not have very powerful machines at home, so we default to the cloud. We will use GCP, making use of the free tier credit. The idea is to provide a step-by-step guide to get Kubeflow running so you can experiment with it at no cost and with a relatively low time investment.

Wait a minute. You may be wondering why we are going to use minikube inside a VM: does Google not have a Kubernetes engine? Indeed it does, but the free tier does not include all the required services, which is why we are resorting to the VM. If you happen to have a paid account and do not mind spending a couple of bucks on experimenting, you can deploy Kubeflow on GCP following the official guide.

This step-by-step guide includes the Virtual Machine set-up. If you are already proficient with this part or have chosen a different cloud provider, feel free to skip to the Kubeflow installation part.

SSH key pair creation

We want to securely access a VM from our local terminal. To do that, we will use SSH authentication. You can follow these steps or take a look at this video. First, open a local terminal. Start by declaring these environmental variables:

export PATH_TO_PRIVATE_KEY=~/.ssh/kf_key
export PATH_TO_PUBLIC_KEY=~/.ssh/kf_key.pub
export GCP_USERNAME="[REPLACE WITH YOUR GCP USERNAME]"
  • PATH_TO_PRIVATE_KEY will be the path to your private SSH key file.
  • PATH_TO_PUBLIC_KEY will be the path to your public SSH key file.
  • GCP_USERNAME is your GCP username; it is the part before the @ in your email. So if your email is my_username@gmail.com, then your GCP_USERNAME would be my_username.

To generate the SSH key pair enter:

ssh-keygen -t rsa -f $PATH_TO_PRIVATE_KEY -C "$GCP_USERNAME"

The output (do not copy) will be something like:

Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/USER/.ssh/kf_key
Your public key has been saved in /home/USER/.ssh/kf_key.pub
The key fingerprint is:
SHA256:hYWutzD/PxrpTougwoZ2PQ8AJ+Eod3XwIzj8mw3otII my_username
The key's randomart image is:
+---[RSA 3072]----+
| . o.... |
|o .. o o.o |
|o+..= ..+ . |
|..+. + ..o |
| .o o.S |
| . o..+=. . |
|Eo. +.+=..+ |
|..+o +..o+ o. |
|..... o..o*o.. |
+----[SHA256]-----+
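
Optionally, you can load the new key into your local ssh-agent so you do not have to pass -i on every connection. A minimal sketch:

eval "$(ssh-agent -s)"        # start the agent if it is not already running
ssh-add $PATH_TO_PRIVATE_KEY  # cache the private key for this session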

GCP Virtual Machine Set-up

Kubeflow can run anywhere Kubernetes can run, but that holds only if there are enough computational resources. For academic purposes, the suggested choice is to use the GCP free credits to spin up a Virtual Machine big enough to work comfortably.

Go to the GCP Console. On the navigation (hamburger) menu go to: Compute Engine >> VM instances >> Create Instance ([+] tab on top)

Give it a name:

name: kubeflow-instance

Machine configuration

e2-standard-8 
* 8 vCPUs
* 32 GB memory

Boot disk

* 100 GB disk
* debian-10-buster-v20201014

Firewall

Allow HTTP traffic
Allow HTTPS traffic

Now click on Management, security, disks, networking, sole tenancy, then go to Networking and under Network tags add:

kubeflow

SSH Keys

On your local terminal enter:

cat $PATH_TO_PUBLIC_KEY

Copy the public key, then go to Security and under SSH Keys, paste it in the prompt.

Click on Create. After a while, you should see your kubeflow-instance VM running (green check).
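
If you prefer the command line over the Console, roughly the same VM can be created with gcloud, either from Cloud Shell or from a machine with the Cloud SDK installed. This is a sketch rather than the exact Console defaults; in particular, the zone is an assumption, so adjust it to your project:

gcloud compute instances create kubeflow-instance \
  --zone=us-central1-a \
  --machine-type=e2-standard-8 \
  --image-family=debian-10 \
  --image-project=debian-cloud \
  --boot-disk-size=100GB \
  --tags=kubeflow,http-server,https-server \
  --metadata=ssh-keys="$GCP_USERNAME:$(cat $PATH_TO_PUBLIC_KEY)"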

Copy the EXTERNAL_IP value to your clipboard and create a variable for it in your local terminal:

export EXTERNAL_IP=[your external_ip]

Firewall rules

To add a firewall rule, go to VPC Network >> Firewall >> [+] CREATE FIREWALL RULE

Name --> kf-firewall (underscores are not allowed in GCP resource names)
Source IP ranges --> 0.0.0.0/0
Protocols and ports --> all

This will allow us to access the Kubeflow dashboard via the VM’s external IP from our web browser.
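
The same rule can also be created from the command line; a sketch with gcloud (the rule is deliberately wide open, which is fine for a throwaway experiment but not for anything long-lived):

gcloud compute firewall-rules create kf-firewall \
  --allow=all \
  --source-ranges=0.0.0.0/0 \
  --target-tags=kubeflow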

Connect to VM

On your local terminal, connect to the VM:

ssh -i $PATH_TO_PRIVATE_KEY $GCP_USERNAME@$EXTERNAL_IP

We have created and started the VM, and we now have remote access to it. Follow the next steps to install Kubeflow.

Kubeflow installation

To install Kubeflow we need a running Kubernetes cluster. For that, we will use minikube, which deploys a Kubernetes cluster inside a Docker container, so we will install Docker first. All of the following steps happen inside the VM that we access remotely.

Docker installation

For the official installation guide go here.

Update the apt package index and install the packages that allow apt to use a repository over HTTPS:

sudo apt-get install wget
sudo apt update
sudo apt install --yes \
  apt-transport-https \
  ca-certificates \
  curl \
  gnupg2 \
  software-properties-common

Add Docker’s official GPG key

curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
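
You can optionally verify that the key was added; per Docker’s documentation, the key fingerprint ends in 0EBFCD88:

sudo apt-key fingerprint 0EBFCD88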

Set up the stable repository

sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"

Install Docker Engine

sudo apt update
sudo apt install --yes docker-ce

Then add your user to the docker group and log out for the change to take effect:

sudo usermod -aG docker $USER
logout

And reconnect to the VM

ssh -i $PATH_TO_PRIVATE_KEY $GCP_USERNAME@$EXTERNAL_IP

Test docker

docker run busybox date

You should get an output (do not copy) similar to:

Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
9758c28807f2: Pull complete
Digest: sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d
Status: Downloaded newer image for busybox:latest
Fri Nov 20 16:12:53 UTC 2020

Authenticate user

gcloud auth configure-docker
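
This registers gcloud as a Docker credential helper for Google’s container registries. The Cloud SDK comes preinstalled on GCP’s Debian images, so the command works out of the box; it is mainly useful later if you push or pull private images from gcr.io. You can see what it changed in Docker’s client config:

cat ~/.docker/config.json   # should now contain credHelpers entries for gcr.io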

kubectl installation

kubectl is the Kubernetes command-line tool. It provides the means to interact with Kubernetes clusters. You can use kubectl to deploy applications, inspect and manage cluster resources, and view logs.

sudo apt-get update && sudo apt-get install -y apt-transport-https gnupg2 curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubectl=1.14.10-00
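
A quick sanity check that the client installed at the pinned version:

kubectl version --client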

minikube installation

minikube is our tool of choice for running Kubernetes; in our case, it will run in the VM instance we just created and configured. minikube runs a single-node Kubernetes cluster, which is good for trying out Kubernetes or for daily development work. Remember, if you do not mind the cost, you could have used Google’s Kubernetes Engine instead.

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube_latest_amd64.deb
sudo dpkg -i minikube_latest_amd64.deb

Create a minikube cluster

minikube start \
--cpus 5 \
--memory 10288 \
--disk-size 20gb \
--kubernetes-version 1.14.10

To restart minikube after the initial creation, you just need minikube start. The output (do not copy) should look like this:

😄  minikube v1.15.1 on Debian 10.6
✨ Automatically selected the docker driver
👍 Starting control plane node minikube in cluster minikube
🚜 Pulling base image ...
🔥 Creating docker container (CPUs=5, Memory=10288MB) ...
🐳 Preparing Kubernetes v1.14.10 on Docker 19.03.13 ...
> kubectl.sha1: 41 B / 41 B [----------------------------] 100.00% ? p/s 0s
> kubeadm.sha1: 41 B / 41 B [----------------------------] 100.00% ? p/s 0s
> kubelet.sha1: 41 B / 41 B [----------------------------] 100.00% ? p/s 0s
> kubeadm: 37.77 MiB / 37.77 MiB [---------------] 100.00% 91.51 MiB p/s 1s
> kubectl: 41.12 MiB / 41.12 MiB [---------------] 100.00% 36.52 MiB p/s 2s
> kubelet: 122.18 MiB / 122.18 MiB [------------] 100.00% 103.13 MiB p/s 2s
🔎 Verifying Kubernetes components...
🌟 Enabled addons: storage-provisioner, default-storageclass
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
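
Before moving on, you can confirm that the cluster is up and that kubectl is talking to it:

minikube status
kubectl get nodes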

Deploy Kubeflow

You will need the following script:

set -e

KF_PATH="$HOME/.kubeflow"
rm -fr $KF_PATH
mkdir -p $KF_PATH
cd $KF_PATH
wget https://github.com/kubeflow/kfctl/releases/download/v1.0/kfctl_v1.0-0-g94c35cf_linux.tar.gz -O kfctl_linux.tar.gz

tar -xvf kfctl_linux.tar.gz

export PATH=$PATH:$KF_PATH
export KF_NAME=my-kubeflow
export BASE_DIR=$KF_PATH
export KF_DIR=${BASE_DIR}/${KF_NAME}
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.0.yaml"

mkdir -p ${KF_DIR}
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_URI}

You can run it line by line, or copy it all into a file kf-1.0.sh and run it:

chmod u+x kf-1.0.sh
./kf-1.0.sh
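
The apply step downloads and installs a lot of components, so it can occasionally fail partway (for example on a transient network error). In that case it is usually enough to re-run just the final part of the script; a sketch, assuming the same paths used above:

export PATH=$PATH:$HOME/.kubeflow
export KF_DIR=$HOME/.kubeflow/my-kubeflow
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.0.yaml"
cd ${KF_DIR}
kfctl apply -V -f ${CONFIG_URI}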

Check the deployment status

kubectl get pod -n kubeflow

Wait for the pods’ status to change to ‘Running’. Be patient, this may take several minutes; re-run the command to check. In the end it should look similar to this (do not copy):

NAME                                                           READY   STATUS             RESTARTS   AGE
admission-webhook-bootstrap-stateful-set-0 1/1 Running 0 8m41s
admission-webhook-deployment-64cb96ddbf-hxh8s 1/1 Running 0 7m56s
application-controller-stateful-set-0 1/1 Running 0 10m
argo-ui-778676df64-jthxl 1/1 Running 0 8m42s
centraldashboard-7dd7dd685d-kw8jk 1/1 Running 0 8m41s
jupyter-web-app-deployment-89789fd5-qclkj 1/1 Running 0 8m39s
katib-controller-6b789b6cb5-z4qrj 1/1 Running 1 8m27s
katib-db-manager-64f548b47c-h6nmt 1/1 Running 0 8m27s
katib-mysql-57884cb488-94lgn 1/1 Running 0 8m27s
katib-ui-5c5cc6bd77-czs8v 1/1 Running 0 8m27s
kfserving-controller-manager-0 2/2 Running 1 8m32s
metacontroller-0 1/1 Running 0 8m42s
metadata-db-76c9f78f77-4qhhg 1/1 Running 0 8m38s
metadata-deployment-674fdd976b-fwv8j 0/1 Running 0 8m38s
metadata-envoy-deployment-5688989bd6-tvcmr 1/1 Running 0 8m38s
metadata-grpc-deployment-5579bdc87b-8j5wt 0/1 CrashLoopBackOff 5 8m38s
metadata-ui-9b8cd699d-t7z4s 1/1 Running 0 8m38s
minio-755ff748b-vzjvk 1/1 Running 0 8m26s
ml-pipeline-79b4f85cbc-27f4j 1/1 Running 3 8m26s
ml-pipeline-ml-pipeline-visualizationserver-5fdffdc5bf-dkf7r 1/1 Running 0 8m24s
ml-pipeline-persistenceagent-645cb66874-wtmfz 1/1 Running 0 8m26s
ml-pipeline-scheduledworkflow-6c978b6b85-lwnfx 1/1 Running 0 8m24s
ml-pipeline-ui-6995b7bccf-cxkkm 1/1 Running 0 8m25s
ml-pipeline-viewer-controller-deployment-8554dc7b9f-qwvfm 1/1 Running 0 8m25s
mysql-598bc897dc-gfctv 1/1 Running 0 8m26s
notebook-controller-deployment-7db57b9ccf-9xbns 1/1 Running 0 8m37s
profiles-deployment-5d87dd4f87-59bsb 2/2 Running 0 8m24s
pytorch-operator-5fd5f94bdd-5cf62 1/1 Running 0 8m36s
seldon-controller-manager-679fc777cd-dfw2t 1/1 Running 0 8m23s
spark-operatorcrd-cleanup-jblpv 0/2 Completed 0 8m39s
spark-operatorsparkoperator-c7b64b87f-xhsxx 1/1 Running 0 8m39s
spartakus-volunteer-6b767c8d6-sq2wq 1/1 Running 0 8m31s
tensorboard-6544748d94-pccql 1/1 Running 0 8m30s
tf-job-operator-7d7c8fb8bb-mqtxz 1/1 Running 0 8m29s
workflow-controller-945c84565-kwd48 1/1 Running 0 8m42s
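
Instead of re-running the command by hand, you can watch the pods with the -w flag; and if a pod stays stuck in CrashLoopBackOff (like metadata-grpc-deployment above, which normally sorts itself out once its dependencies are up), describe it or check its logs:

kubectl get pod -n kubeflow -w                # watch status changes, Ctrl+C to stop
kubectl describe pod <pod-name> -n kubeflow   # events, image pulls, restart reasons
kubectl logs <pod-name> -n kubeflow           # container logs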

Port forwarding

Forward the port so you can open Kubeflow’s dashboard:

kubectl port-forward --address 0.0.0.0 -n istio-system svc/istio-ingressgateway 8080:80
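
Note that port-forward runs in the foreground and stops when you close the SSH session. If you want it to keep running in the background, one option (a sketch) is:

nohup kubectl port-forward --address 0.0.0.0 -n istio-system \
  svc/istio-ingressgateway 8080:80 > port-forward.log 2>&1 &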

If you feel like reading a bit more about Kubernetes port forwarding, go to this resource.

The Dashboard

You can now open the Kubeflow dashboard in an external browser (replace EXTERNAL_IP with your VM’s external IP):

http://EXTERNAL_IP:8080/

You should see the welcome prompt. Click on Start Setup and create a namespace.

There you have it: the Kubeflow dashboard!

Now you can start experimenting with Kubeflow. If you have not watched Google’s introductory videos, do so; they provide a good overview of what the tool is capable of. Experiment with the Jupyter Notebooks, do some hyperparameter tuning with Katib, deploy a pipeline with Pipelines, or deploy a model with KFServing.

Thanks to my peers Juan A. Londoño, Edwar Ortiz, Paulo Morillo, and Jorge Zafra for their contributions to this guide.

Let’s get in touch. LinkedIn, GitHub, Twitter.
