How to create your own Deep Learning Project in Azure

Using Azure Storage, Databricks with Keras and Azure ML Service

René Bremer
Towards Data Science


1. Introduction

Deep learning has a lot of practical applications for companies, such as image recognition, video indexing and speech-to-text transcription. However, it can be daunting for companies to start deep learning projects. Common issues are the sensitivity of the data used and the complexity of deep learning, which can be seen as the most advanced form of machine learning.

2. Objective

In this tutorial, a sample deep learning project is created that is able to recognize classes of pictures from the CIFAR-10 dataset (e.g. plane, frog, ship). The following steps are executed:

  • Azure Storage is used to securely store the pictures
  • Azure Databricks is used to train the model using Keras and TensorFlow
  • Azure ML Service is used to version and deploy the model as an HTTP endpoint

This can be depicted in the following architecture overview:

2. High level overview

Key in this solution is Azure Databricks, which is an Apache Spark-based analytics platform optimized for Azure. It natively integrates with other Azure services such as Azure Active Directory and Azure Storage. In the remainder of this blog, the following steps will be executed:

  • 3. Prerequisites
  • 4. Create Deep Learning project
  • 5. Deploy Deep Learning model
  • 6. Conclusion

It is a standalone tutorial in which the focus is to set up your own deep learning project in Azure, to “get your hands dirty” and so to get more familiar with the subject. The focus is less on the inner workings of deep learning, the latest algorithms or computer vision APIs. In case you are more interested in DevOps for AI, refer to my previous blogs, here and, with a focus on security, here.

3. Prerequisites

The following resources need to be created in the same resource group and in the same location:

  • Azure Storage account
  • Azure Databricks workspace
  • Azure Machine Learning service workspace

4. Create Deep Learning project

The following steps will be executed in this part.

  • 4a. Add pictures to storage account
  • 4b. Create deep learning cluster
  • 4c. Mount storage account
  • 4d. Train deep learning model on single node
  • 4e. Terminate deep learning cluster

4a. Add pictures to storage account

In this tutorial the CIFAR-10 data is used to train the deep learning model. In the first model, a subset of 2000 pictures will be used. Subsequently, a second model will be created that uses the full CIFAR-10 dataset of 60000 pictures.
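As a point of reference for these numbers, the full CIFAR-10 dataset can also be loaded directly via Keras; a minimal sketch, assuming the keras package is available:

# Minimal sketch: load the full CIFAR-10 dataset directly via Keras as a
# point of reference (50000 training and 10000 test pictures of 32x32 pixels).
from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(x_train.shape)  # (50000, 32, 32, 3)
print(x_test.shape)   # (10000, 32, 32, 3)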

To illustrate how to use Azure Storage in combination with Azure Databricks, the subset of 2000 pictures will be stored in the storage account. Therefore, go to the following URL to download a subset of 2000 pictures in a zip file.

https://github.com/rebremer/devopsai_databricks/raw/master/sampledata/2000pics_cifar10.zip

Next, go to the Azure portal and select your storage account. Then select Blobs and create a new container named “2000picscifar10”. Subsequently, upload the zip file you downloaded earlier into your container.
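If you prefer scripting over the portal, the same container and upload can be done from Python. A minimal sketch, assuming the azure-storage-blob package (v12) is installed and the zip file is in your working directory; the tutorial itself only requires the portal steps above:

# Minimal sketch (assumption: azure-storage-blob v12); the portal steps
# above achieve the same result.
from azure.storage.blob import BlobServiceClient

account = "<<your_storage_account>>"  # fill in your own values
key = "<<your_key_in_step_4a>>"

service = BlobServiceClient(
    account_url="https://" + account + ".blob.core.windows.net",
    credential=key)
container = service.create_container("2000picscifar10")
with open("2000pics_cifar10.zip", "rb") as data:
    container.upload_blob(name="2000pics_cifar10.zip", data=data)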

4a1. Add zipfile to new container

Finally, go to Access Keys and copy the key of your storage account.

4a2. Copy access keys

4b. Create deep learning cluster

Go to your Azure Databricks workspace and then to Clusters. Since the model will be trained on the driver node without using Spark jobs, there is no need to create (and pay for) worker nodes. Therefore, create a new GPU cluster with the following settings:

4b1. Create GPU cluster without worker nodes

4c. Mount storage account

Go to your Azure Databricks workspace, right-click and select Import. In the radio button, select URL and import the following notebook:

https://raw.githubusercontent.com/rebremer/devopsai_databricks/master/project/modelling/0_mountStorage.py

See also picture below:

4c1. Import notebook to mount storage

Open your notebook and change the following settings:

storageaccount="<<your_storage_account>>"
account_key="<<your_key_in_step4a>>"
containername="2000picscifar10" # change if you used another name

Notice that keys should never be stored in a notebook in a production situation. Instead, a secret scope should be used; see also my blog on how to embed security in data science projects. Then attach the notebook to the cluster you created and press SHIFT+ENTER to run it cell by cell.

4c2. Attach notebook to cluster

In this notebook, the following steps will be executed (the core mount call is sketched after the list):

  • Mount storage account to Azure Databricks Workspace
  • Unzip pictures in storage account
  • List and show pictures
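The heart of the mount notebook is a single dbutils.fs.mount call. A minimal sketch of what it does, using the settings above (dbutils is available in every Databricks notebook):

# Minimal sketch: mount the blob container under /mnt so that the pictures
# can be read like local files.
dbutils.fs.mount(
    source="wasbs://" + containername + "@" + storageaccount + ".blob.core.windows.net",
    mount_point="/mnt/" + containername,
    extra_configs={"fs.azure.account.key." + storageaccount +
                   ".blob.core.windows.net": account_key})

# Afterwards, the zip file can be listed (and unzipped) from the mount point.
display(dbutils.fs.ls("/mnt/" + containername))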

4d. Train deep learning model on single node

Go to your Azure Databricks workspace again, right-click and select Import. In the radio button, select URL and import the following notebook:

https://github.com/rebremer/devopsai_databricks/blob/master/project/modelling/1_DeepLearningCifar10NotebookExploration.py

In this notebook, the following steps will be executed (a minimal training sketch follows the list):

  • Import and process data from the storage account
  • Build a model on the 2000 pictures in the storage account
  • Build a model on all 60000 pictures of the full CIFAR-10 dataset
  • Save the model to disk (dbfs)
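To give an impression of what the notebook does, below is a minimal training sketch: a small Keras CNN on CIFAR-10, saved to dbfs afterwards. The actual notebook may use a different architecture and parameters:

# Minimal sketch (assumption: the notebook's actual architecture may differ):
# a small convolutional network on CIFAR-10, saved to dbfs afterwards.
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train, y_test = to_categorical(y_train, 10), to_categorical(y_test, 10)

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation="relu"),
    Dense(10, activation="softmax")])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
model.save("/dbfs/cifar10_model.h5")  # dbfs path, readable from Databricks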

When you run the notebook successfully, you can see an overview of the predictions (red indicates wrong predictions).

4d1. Overview of predictions

4e. Terminate deep learning cluster

Running GPU clusters can be costly. Since we do not need the GPU cluster in the remainder of this tutorial, we can stop it. Therefore, go to your cluster and select Terminate.

4e1. Terminate cluster

5. Deploy Deep Learning model

The following steps will be executed in this part.

  • 5a. Create new cluster in Databricks
  • 5b. Add libraries to cluster
  • 5c. Register model and log metrics
  • 5d. Create an HTTP endpoint of model
  • 5e. Test HTTP endpoint of model

5a. Create new cluster in Databricks

In this step, a new cluster is created that will be used to deploy our model. Since the model is already trained, we don’t need GPUs anymore. Create a cluster with the following settings:

5a1. Non GPU cluster

5b. Add libraries to cluster

To deploy our model, we need a couple of libraries. Go to your Shared folder, right-click in the folder and select “Create Library”.

5b1. Add libraries

Subsequently, select PyPI and add the following libraries to the shared folder:

azureml-sdk[databricks]
keras
tensorflow

5c. Register model and log metrics

In this step, the model and its metrics will be added to your Azure ML service workspace. Import the following notebook into your Azure Databricks workspace:

https://raw.githubusercontent.com/rebremer/devopsai_databricks/master/project/modelling/2a_Cifar10KerasNotebookLogModel.py

Open your notebook and change the following settings of your Azure ML service workspace (the subscription id can be found in the Overview tab of your Azure ML service workspace instance).

workspace="<<your_name_of_azure_ml_service_workspace>>"
resource_grp="<<your_resource_group_amlservice>>"
subscription_id="<<your_subscriptionid_amlservice>>"

Then run the notebook, again using Shift+Enter to go through it cell by cell. In this notebook, the following steps will be executed (a sketch of the core SDK calls follows the list):

  • Log metrics of the model that was trained on 2000 pictures
  • Log metrics of the model that was trained on all 60000 pictures
  • Register the best model, to be deployed in step 5d
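A minimal sketch of the core Azure ML service SDK calls in this notebook, using the settings above; the experiment name and metric values below are illustrative, not the notebook's actual ones:

# Minimal sketch (assumption: experiment name and metric values are
# illustrative) of logging metrics and registering a model with the
# Azure ML service SDK (azureml-sdk[databricks], installed in step 5b).
from azureml.core import Workspace, Experiment
from azureml.core.model import Model

ws = Workspace(subscription_id=subscription_id,
               resource_group=resource_grp,
               workspace_name=workspace)

experiment = Experiment(ws, "cifar10_keras")   # hypothetical name
run = experiment.start_logging()
run.log("accuracy_2000pics", 0.45)             # illustrative values
run.log("accuracy_60000pics", 0.74)
run.complete()

model = Model.register(workspace=ws,
                       model_path="/dbfs/cifar10_model.h5",  # from step 4d
                       model_name="cifar10_model")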

To view the metrics, go to the portal, select your Azure Machine Learning service workspace and then Experiment, see below.

5c1. Log metrics of model with 2000 pictures and all CIFAR-10 pictures

5d. Create an HTTP endpoint of model

Import the following notebook into your workspace:

https://raw.githubusercontent.com/rebremer/devopsai_databricks/master/project/modelling/2b_Cifar10KerasNotebookDeployModel.py

Again, change the parameters in the same way as in step 5c. In this notebook, an endpoint is created for the model that was trained on all pictures. When you go to the portal and select Azure Container Instances, you will find the IP address of the endpoint (a sketch of the underlying deployment call follows the picture).

5d1. Azure Container Instance with IP address
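A minimal sketch of the deployment call behind this notebook, continuing from the sketch in step 5c; the scoring script and service name are simplified placeholders:

# Minimal sketch (assumption: score.py and the service name are simplified
# placeholders) of deploying the registered model as an HTTP endpoint on
# Azure Container Instances with the Azure ML service SDK.
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

inference_config = InferenceConfig(entry_script="score.py")  # scoring script
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

service = Model.deploy(workspace=ws,
                       name="cifar10-endpoint",   # hypothetical name
                       models=[model],            # registered in step 5c
                       inference_config=inference_config,
                       deployment_config=aci_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)  # HTTP endpoint, backed by the ACI IP address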

5e. Test HTTP endpoint of model

To test the endpoint, the following steps are executed (a Python alternative to Postman is sketched after the list):

  • Get a random png from the internet in one of the CIFAR-10 categories
  • Convert png to base64 encoding using this website
  • Send base64 payload with a tool like Postman to create predictions.
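For completeness, the same test can be scripted in Python instead of Postman. A minimal sketch, assuming the scoring script expects a JSON body with the base64-encoded picture under the key "data"; adjust to the actual contract of the deployed scoring script:

# Minimal sketch of testing the endpoint from Python instead of Postman.
# Assumption: the scoring script expects a JSON body with a base64-encoded
# picture under the key "data"; adjust to the actual contract of score.py.
import base64
import json
import requests

with open("ship.png", "rb") as f:                  # any CIFAR-10-like png
    payload = base64.b64encode(f.read()).decode("utf-8")

response = requests.post("http://<<ip_address_step5d>>/score",
                         data=json.dumps({"data": payload}),
                         headers={"Content-Type": "application/json"})
print(response.json())  # predicted class, e.g. "ship"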

The following pictures show a picture of a ship, converted to base64 and sent with Postman to create a prediction using the endpoint.

5e1. Picture of ship converted to base64 using https://onlinepngtools.com/convert-png-to-base64
5e2. Prediction of ship using HTTP endpoint

6. Conclusion

In this tutorial, a deep learning project was created in which the following services were used:

  • Azure Storage account to securely store the pictures
  • Azure Databricks with Tensorflow and Keras to build the model
  • Azure ML Service to keep track of the model and create an HTTP endpoint

Deep learning has a lot of practical applications for enterprises. However, it can be daunting for enterprises to start deep learning projects. Creating your own sample project and “getting your hands dirty” is a great way to learn and to get more familiar with the subject.

6. Sample project overview

Finally, big thanks to my colleague Ilona Stuhler, who was so kind to provide me with crucial insights into the subject and to donate her project to me.


Data Solution Architect @ Microsoft, working with Azure services such as ADFv2, ADLSgen2, Azure DevOps, Databricks, Function Apps and SQL. Opinions here are mine.