Azure Machine Learning is a cloud service for accelerating and managing the machine learning project lifecycle. It enables you to create your own models or use models built with open-source platforms such as PyTorch, TensorFlow, or scikit-learn. Azure ML is complemented by additional MLOps tools, which help you monitor, retrain, and redeploy models.
In this blog post I want to demonstrate how to build, train, and deploy a model using an Azure Machine Learning notebook and compute instance for development, a cluster for training, and a containerized web service for production. For this demo we will use the Titanic passenger data (Titanic.csv), predicting who will survive using a Random Forest model.
Setting up the environment
If you have an Azure account, you can go to the Azure Portal and create a Machine Learning workspace. Use the search bar to find Machine Learning and hit the Create button.

Next you need to fill in the project details, such as your subscription ID, resource group name, and workspace name. The basics are sufficient, so hit 'Review + create'.

After deployment you can examine all the newly created resources in the resource group overview pane.

The Azure Machine Learning service comes with a default storage account. For this demo, you will use this default storage account to make the Titanic data available. Click on the storage account, click on the default azureml-blobstore-xxx container, and upload your Titanic.csv file using the upload button.

For this demo you will also need the storage account access keys. Go back to the default storage account, click 'Access keys' on the left, click 'Show keys', and copy the key information to a secure location for later use.
Service Principal Authentication
For setting up a machine learning workflow as an automated process, I recommend using Service Principal Authentication. This approach decouples the authentication from any specific user login, and allows managed access control.
The first step is to create a service principal. Select Azure Active Directory and then App registrations. Select +New registration and give your service principal a name, for example my-svp-machine-learning. You can leave the other parameters as they are.

From the page of your newly created service principal, copy the Application ID and Tenant ID, as they are needed later. Then select Certificates & secrets and +New client secret. Write a description for your key, select a duration, and click Add. Copy the value of the client secret to a secure location for later.

Finally, you need to give the service principal permission to access your Azure ML workspace. Navigate to Resource Groups and open the resource group that contains your Machine Learning workspace. Select Access control (IAM) and Add a role assignment. For Role, specify the level of access you need to grant, for example Contributor. Start typing your service principal name, select it once it is found, and click Save.

Azure Machine Learning
After setting everything up, we’re good to go. Head back to the Machine Learning service and launch the Studio.

After launching the studio environment you will see an overview screen with many boxes and controls. Let me give you a high-level understanding of the components and how they work together to assist in building, deploying, and maintaining machine learning models; the short sketch after the list shows how these concepts map to the Python SDK.

- A workspace is the centralized place which brings together all services and platform components.
- A compute target is any machine or set of machines (cluster) you use to run your training script or host your service deployment.
- Datasets make it easier to access and work with your data. By creating a dataset, you create a reference to the data source location along with a copy of its metadata.
- Environments are the encapsulation of the environment where training or scoring of your machine learning model happens.
- An experiment is a grouping of many runs from a specified script. It always belongs to a workspace. When you submit a run, you provide an experiment name.
- A run is a single execution of a training script. An experiment will typically contain multiple runs.
- A notebook is used to write and run your own code in integrated Jupyter notebook servers.
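To make these concepts more tangible, here is a minimal sketch (assuming the azureml-core SDK is installed and a config.json for your workspace has been downloaded) that connects to a workspace and lists the components registered in it:
from azureml.core import Workspace, Experiment

ws = Workspace.from_config()                      # the workspace is the central hub

print(list(ws.compute_targets.keys()))            # compute targets registered in the workspace
print(list(ws.datastores.keys()))                 # datastores (storage connections)
print(list(ws.datasets.keys()))                   # registered datasets
print([exp.name for exp in Experiment.list(ws)])  # experiments and their runs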
Setting up a compute instance
For developing new models you need a compute instance, a managed cloud-based workstation that comes with multiple tools and environments pre-installed for machine learning. The primary use of a compute instance is as your development workstation, but it can also be used as a compute target for training and inferencing jobs. On the left, click on Compute and create a new compute instance. You can leave the default settings for now, so click 'Create' to create your first compute instance.

Building your first notebook
Let’s continue our journey by developing our Titanic survival prediction model. You will do this using Jupyter notebooks. Click on Notebooks, then 'Create new' and choose to create a new notebook. The folder and file structure on the left will be similar to what you see in the image below. Click on +New. In my case I entered Titanic.ipynb for my first notebook, but you can choose any name. Please note this notebook is for development and testing purposes first.

For the model itself I used the Predicting the Survival of Titanic Passengers article from towardsdatascience.com. I enhanced the code with additional configuration to tightly integrate the process with Azure ML. Let’s walk through the script block by block:
# importing necessary libraries
from azureml.core import Workspace, Datastore, Dataset, Experiment, Model
from azureml.data.dataset_factory import DataType
from azureml.exceptions import UserErrorException
# importing sklearn libraries
import sklearn
from sklearn import linear_model
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn import preprocessing
# useful for a good split of data into train and test
from sklearn.model_selection import train_test_split
# import os for reading environment variables
import os
# import pandas
import pandas as pd
# linear algebra
import numpy as np
# import re package
import re
# import joblib
import joblib
# import seaborn
import seaborn as sns
# import matplotlib
%matplotlib inline
from matplotlib import pyplot as plt
The first block of code imports all necessary libraries. The azureml.core packages are needed for communication with our Azure ML workspace. The remaining packages, such as sklearn, are needed for our model.
# get existing workspace
ws = Workspace.from_config()
# set connection settings (paste the storage account key you stored earlier)
blob_datastore_name='machinelearnin9342837683'
container_name=os.getenv("BLOB_CONTAINER", "azureml-blobstore-867fb2e1-6f13-4490-bc45-291d19912ec0")
account_name=os.getenv("BLOB_ACCOUNTNAME", "machinelearnin9763237684")
account_key=os.getenv("BLOB_ACCOUNT_KEY", "<your-storage-account-access-key>")
# connect to the blob storage account, registering it as a datastore if needed
try:
    datastore = Datastore.get(ws, blob_datastore_name)
    print("Found Blob Datastore with name: %s" % blob_datastore_name)
except UserErrorException:
    datastore = Datastore.register_azure_blob_container(
        workspace=ws,
        datastore_name=blob_datastore_name,
        account_name=account_name,      # storage account name
        container_name=container_name,  # name of the Azure blob container
        account_key=account_key,        # storage account key
        protocol='https')
    print("Registered blob datastore with name: %s" % blob_datastore_name)
# attach Titanic.csv file
dataset = Dataset.Tabular.from_delimited_files(path=[(datastore, 'Titanic.csv')])
The next block of code accesses our workspace and retrieves the Titanic data from your storage account and container. The credentials in this script are used to register the storage account as a new datastore (copy and paste the access key from your secure location). Once the datastore is registered, you can remove these credential lines, because Azure ML stores the connection details for you. This way you don't need to keep any credentials in your script. For now, let's leave it this way.
# register Dataset as version 1
dataset.register(workspace = ws, name = 'titanic', create_new_version = True)
This line registers the dataset as version 1 within the workspace.
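Once registered, the dataset can later be retrieved by name from any notebook or script attached to the workspace, optionally pinning a specific version; a minimal sketch:
from azureml.core import Dataset

# retrieve the latest registered version, or pin version 1 explicitly
titanic_latest = Dataset.get_by_name(ws, name='titanic')
titanic_v1 = Dataset.get_by_name(ws, name='titanic', version=1)
print(titanic_v1.name, titanic_v1.version)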
experiment = Experiment(ws, "TitanicExperiment")
run = experiment.start_logging(outputs=None, snapshot_directory=".")
Next you create your Titanic experiment and start a run. This is useful because from the experiment you can oversee all input datasets used, train and test data, results, logging information, and so on. Let’s jump over to the machine learning model itself.
# convert dataset to pandas dataframe
titanic_ds = dataset.to_pandas_dataframe()
print("Examine titanic dataset")
titanic_ds.info()
print("Show first records")
titanic_ds.head(10)
# convert 'Sex' feature into numeric
genders = {"male": 0, "female": 1}
data = [titanic_ds]
for dataset in data:
    dataset['Sex'] = dataset['Sex'].map(genders)
# since the most common port is Southampton, the chances are that the missing ones are from there
titanic_ds['Embarked'].fillna(value='S', inplace=True)
# convert 'Embarked' feature into numeric
ports = {"S": 0, "C": 1, "Q": 2}
data = [titanic_ds]
for dataset in data:
    dataset['Embarked'] = dataset['Embarked'].map(ports)
# convert 'Survived' feature into numeric
survived = {False: 0, True: 1}
data = [titanic_ds]
for dataset in data:
    dataset['Survived'] = dataset['Survived'].map(survived)
# a cabin number looks like 'C123' and the letter refers to the deck,
# therefore we're going to extract it and create a new feature that contains a person's deck
deck = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "U": 8}
data = [titanic_ds]
for dataset in data:
    dataset['Cabin'] = dataset['Cabin'].fillna("U0")
    dataset['Deck'] = dataset['Cabin'].map(lambda x: re.compile("([a-zA-Z]+)").search(x).group())
    dataset['Deck'] = dataset['Deck'].map(deck)
    dataset['Deck'] = dataset['Deck'].fillna(0)
    dataset['Deck'] = dataset['Deck'].astype(int)
# drop Cabin since we have a Deck feature
titanic_ds = titanic_ds.drop(['Cabin'], axis=1)
# fix the Age feature's missing values
data = [titanic_ds]
for dataset in data:
    mean = titanic_ds["Age"].mean()
    std = titanic_ds["Age"].std()
    is_null = dataset["Age"].isnull().sum()
    # compute random numbers between mean - std and mean + std
    rand_age = np.random.randint(mean - std, mean + std, size=is_null)
    # fill NaN values in the Age column with the random values generated
    age_slice = dataset["Age"].copy()
    age_slice[np.isnan(age_slice)] = rand_age
    dataset["Age"] = age_slice
    dataset["Age"] = titanic_ds["Age"].astype(int)
# convert 'Age' to a feature holding a category
data = [titanic_ds]
for dataset in data:
    dataset['Age'] = dataset['Age'].astype(int)
    dataset.loc[ dataset['Age'] <= 11, 'Age'] = 0
    dataset.loc[(dataset['Age'] > 11) & (dataset['Age'] <= 18), 'Age'] = 1
    dataset.loc[(dataset['Age'] > 18) & (dataset['Age'] <= 22), 'Age'] = 2
    dataset.loc[(dataset['Age'] > 22) & (dataset['Age'] <= 27), 'Age'] = 3
    dataset.loc[(dataset['Age'] > 27) & (dataset['Age'] <= 33), 'Age'] = 4
    dataset.loc[(dataset['Age'] > 33) & (dataset['Age'] <= 40), 'Age'] = 5
    dataset.loc[(dataset['Age'] > 40) & (dataset['Age'] <= 66), 'Age'] = 6
    dataset.loc[ dataset['Age'] > 66, 'Age'] = 6
# create titles
data = [titanic_ds]
titles = {"Mr": 1, "Miss": 2, "Mrs": 3, "Master": 4, "Rare": 5}
for dataset in data:
    # extract titles
    dataset['Title'] = dataset.Name.str.extract(' ([A-Za-z]+)\.', expand=False)
    # replace uncommon titles with a more common title or with Rare
    dataset['Title'] = dataset['Title'].replace(['Lady', 'Countess', 'Capt', 'Col', 'Don', 'Dr',
                                                 'Major', 'Rev', 'Sir', 'Jonkheer', 'Dona'], 'Rare')
    dataset['Title'] = dataset['Title'].replace('Mlle', 'Miss')
    dataset['Title'] = dataset['Title'].replace('Ms', 'Miss')
    dataset['Title'] = dataset['Title'].replace('Mme', 'Mrs')
    # convert titles into numbers
    dataset['Title'] = dataset['Title'].map(titles)
    # filling NaN with 0, to be safe
    dataset['Title'] = dataset['Title'].fillna(0)
# drop Name and Ticket since we have created a Title feature
titanic_ds = titanic_ds.drop(['Name', 'Ticket'], axis=1)
# default missing fare rates to the mean fare
titanic_ds['Fare'].fillna(value=titanic_ds.Fare.mean(), inplace=True)
# convert 'Fare' to a feature holding a category
data = [titanic_ds]
for dataset in data:
    dataset.loc[ dataset['Fare'] <= 7.91, 'Fare'] = 0
    dataset.loc[(dataset['Fare'] > 7.91) & (dataset['Fare'] <= 14.454), 'Fare'] = 1
    dataset.loc[(dataset['Fare'] > 14.454) & (dataset['Fare'] <= 31), 'Fare'] = 2
    dataset.loc[(dataset['Fare'] > 31) & (dataset['Fare'] <= 99), 'Fare'] = 3
    dataset.loc[(dataset['Fare'] > 99) & (dataset['Fare'] <= 250), 'Fare'] = 4
    dataset.loc[ dataset['Fare'] > 250, 'Fare'] = 5
    dataset['Fare'] = dataset['Fare'].astype(int)
# create not_alone and relatives features
data = [titanic_ds]
for dataset in data:
    dataset['relatives'] = dataset['SibSp'] + dataset['Parch']
    dataset.loc[dataset['relatives'] > 0, 'not_alone'] = 0
    dataset.loc[dataset['relatives'] == 0, 'not_alone'] = 1
    dataset['not_alone'] = dataset['not_alone'].astype(int)
# create age class
data = [titanic_ds]
for dataset in data:
    dataset['Age_Class'] = dataset['Age'] * dataset['Pclass']
# create fare per person
data = [titanic_ds]
for dataset in data:
    dataset['Fare_Per_Person'] = dataset['Fare'] / (dataset['relatives'] + 1)
    dataset['Fare_Per_Person'] = dataset['Fare_Per_Person'].astype(int)
# convert all data to numbers
le = preprocessing.LabelEncoder()
titanic_ds = titanic_ds.apply(le.fit_transform)
print("Show first records of all the features created")
titanic_ds.head(10)
In this demo we’re creating a model that predicts who would have survived (or died) during the sinking of the Titanic. The large block above contains all the code for preparing our data. In summary, you converted many features into numeric values and aggregated data to improve the accuracy of the model. For example, you converted Age into a category and compiled titles from the names. Additionally, you fixed missing values and derived data for additional features. The last line shows the first 10 rows of all features created.
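Before moving on, it is worth a quick sanity check that the feature engineering left no missing values and that every column is numeric; a small sketch:
# quick sanity check after feature engineering
print(titanic_ds.isnull().sum())   # expect all zeros
print(titanic_ds.dtypes)           # expect numeric (integer) dtypes only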
# split our data into a test (30%) and train (70%) dataset
test_data_split = 0.30
msk = np.random.rand(len(titanic_ds)) < test_data_split
test = titanic_ds[msk]
train = titanic_ds[~msk]
# drop 'PassengerId' from the train set, because it does not contribute to a persons survival probability
train = train.drop(['PassengerId'], axis=1)
X_train, X_test, Y_train, Y_test = train_test_split(train.drop("Survived", axis=1), train["Survived"],test_size=0.4,random_state=54,shuffle=True)
Next you split the full dataset into a test and a train dataset. In our case all passenger information is combined into one large dataset, which is then split into data for training and data for validating your model. Conceptually you can see what we’ve done below:

# save data (the download folder holds local copies)
os.makedirs('download', exist_ok=True)
np.savetxt('download/train.csv', train, delimiter=',')
np.savetxt('download/test.csv', test, delimiter=',')
np.savetxt('download/X_train.csv', X_train, delimiter=',')
np.savetxt('download/Y_train.csv', Y_train, delimiter=',')
np.savetxt('download/X_test.csv', X_test, delimiter=',')
np.savetxt('download/Y_test.csv', Y_test, delimiter=',')
# upload data to the blob storage account
datastore.upload_files(files=['download/train.csv', 'download/test.csv', 'download/X_train.csv',
                              'download/Y_train.csv', 'download/X_test.csv', 'download/Y_test.csv'],
                       target_path='titanic_data/',
                       overwrite=True)
# attach all datasets
dataset_train = Dataset.Tabular.from_delimited_files(path=[(datastore, 'titanic_data/train.csv')])
dataset_test = Dataset.Tabular.from_delimited_files(path=[(datastore, 'titanic_data/test.csv')])
dataset_X_train = Dataset.Tabular.from_delimited_files(path=[(datastore, 'titanic_data/X_train.csv')])
dataset_Y_train = Dataset.Tabular.from_delimited_files(path=[(datastore, 'titanic_data/Y_train.csv')])
dataset_X_test = Dataset.Tabular.from_delimited_files(path=[(datastore, 'titanic_data/X_test.csv')])
dataset_Y_test= Dataset.Tabular.from_delimited_files(path=[(datastore, 'titanic_data/Y_test.csv')])
# register datasets as version 1
dataset_train.register(workspace = ws, name = 'train', create_new_version = True)
dataset_test.register(workspace = ws, name = 'test', create_new_version = True)
dataset_X_train.register(workspace = ws, name = 'X_train', create_new_version = True)
dataset_Y_train.register(workspace = ws, name = 'Y_train', create_new_version = True)
dataset_X_test.register(workspace = ws, name = 'X_test', create_new_version = True)
dataset_Y_test.register(workspace = ws, name = 'Y_test', create_new_version = True)
The lines of code above store and version all our data within Azure ML. At a later stage you will register and link your versioned model to this versioned data, including all other metadata, such as performance metrics, accuracy, and so on. This is important for reproducibility and for staying in control.
# Random Forest
random_forest = RandomForestClassifier(n_estimators=100)
random_forest.fit(X_train, Y_train)
# save model as a pickle file
os.makedirs('outputs', exist_ok=True)
joblib.dump(random_forest, "outputs/random_forest.pkl")
# Predict and get result
Y_prediction = random_forest.predict(X_test)
random_forest.score(X_train, Y_train)
acc_random_forest = round(random_forest.score(X_train, Y_train) * 100, 2)
# show the important features for the classification
feature_imp = pd.Series(random_forest.feature_importances_, index=X_train.columns).sort_values(ascending=False)
plt.figure(figsize=(10,6))
sns.barplot(x=feature_imp, y=feature_imp.index)
# Add labels to your graph
plt.xlabel('Feature Importance Score')
plt.ylabel('Features')
plt.title("Visualizing Important Features")
plt.tight_layout()
Next you use a Random Forest model, which is a supervised machine learning algorithm. It builds an ensemble of decision trees, each trained on a random subset of the data and features, and combines their predictions. The plot shows the most dominant features within the model.
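Note that acc_random_forest above is measured on the training data, which tends to be optimistic. To get a more realistic picture you can also score the held-out test split, reusing the Y_prediction computed above; a short sketch:
# evaluate on the held-out test split instead of the training data
acc_test = round(metrics.accuracy_score(Y_test, Y_prediction) * 100, 2)
print("Test accuracy:", acc_test)
print(classification_report(Y_test, Y_prediction))
print(confusion_matrix(Y_test, Y_prediction))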
# register model within the workspace
model_random_forest = Model.register(workspace=ws,
                                     model_name='random_forest',
                                     model_path='outputs/random_forest.pkl',
                                     model_framework=Model.Framework.SCIKITLEARN,
                                     model_framework_version=sklearn.__version__,
                                     sample_input_dataset=dataset_X_train,
                                     sample_output_dataset=dataset_Y_train,
                                     description='Titanic survival prediction using random forest.',
                                     datasets=[('train', dataset_train), ('test', dataset_test),
                                               ('X_train', dataset_X_train), ('Y_train', dataset_Y_train),
                                               ('X_test', dataset_X_test), ('Y_test', dataset_Y_test)],
                                     tags={'type': 'classification'})
With the lines of code above you register your model within Azure ML. The model itself is serialized as a pickle file, which means it has been written to disk for later use. Additionally, you linked the model to your training datasets.
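Registered models can later be retrieved and downloaded by name, for example from a scoring script or a release pipeline; a minimal sketch:
import joblib
from azureml.core import Model

# fetch the latest registered version of the model and load the pickle file
registered = Model(ws, name='random_forest')
print(registered.name, registered.version)
local_path = registered.download(target_dir='.', exist_ok=True)
loaded_model = joblib.load(local_path)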
# log the feature importance plot image
run.log_image('Feature Importance Score', plot=plt)
# log the accuracy and complete the run
run.log("Random Forest accuracy", acc_random_forest)
run.complete()
Lastly you log the accuracy of our model and mark the run as completed.
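After run.complete(), the logged metrics and images show up in the studio, but you can also query them from code; a small sketch:
# retrieve the metrics logged for this run, or browse earlier runs of the experiment
print(run.get_metrics())
for past_run in experiment.get_runs():
    print(past_run.id, past_run.status, past_run.get_metrics())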
MLOps capabilities
In a typical development flow, developing a model is just the first step. The biggest effort goes into making everything production-ready, which includes data collection, data preparation, training, serving, and monitoring. You just executed these first steps and stored all the related metadata within Azure ML. For example, you can see the plot image below, along with other information about our model, such as experiments, versioned datasets, logging information, and so on.

The big benefit is that you can correlate models and experiments to reliable and reproducible results. Team members can easily observe, explain, and improve model behavior and accuracy, because all information is bundled together within a central place.
Scaling up with compute clusters
The next step is to train the model on a cluster, which is useful for compute-intensive processing, for example when using large datasets or complex computations, because the work can be distributed over several machines. Within the Compute section you can create and configure your clusters. For this demonstration you can use the standard configuration by clicking 'Next'.

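If you prefer to create the cluster from code instead of the portal, the hedged sketch below does the same with the SDK (the cluster name and VM size are illustrative; the name matches the one used in script.py later):
from azureml.core import Workspace
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

ws = Workspace.from_config()
cluster_name = 'cluster1'  # illustrative name, reused by script.py below

try:
    # reuse the cluster if it already exists
    cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing cluster.')
except ComputeTargetException:
    # otherwise provision a small autoscaling cluster
    config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS3_V2',
                                                   min_nodes=0,
                                                   max_nodes=4)
    cluster = ComputeTarget.create(ws, cluster_name, config)
    cluster.wait_for_completion(show_output=True)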
For submitting jobs to our cluster you need to set up some additional scripts. The first script is called setup.sh and installs the required packages.
#!/bin/bash
pip install "azureml-sdk[notebooks]"
pip install azureml-core
pip install azure-storage-blob
pip install joblib
The second script is called script.py and interacts with our Azure ML workspace to submit the job. Let's examine the code block below.
from azureml.core import ScriptRunConfig, Experiment
from azureml.core import Workspace
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.core import Environment
from azureml.widgets import RunDetails
from azureml.core.authentication import ServicePrincipalAuthentication

# authenticate using the service principal created earlier
svc_pr = ServicePrincipalAuthentication(
    tenant_id="<tenant-id>",
    service_principal_id="<application-id>",
    service_principal_password="<client-secret>")

ws = Workspace(
    subscription_id="<subscription-id>",
    resource_group="rg-machinelearning",
    workspace_name="machinelearning",
    auth=svc_pr
)

# create or load an experiment
experiment = Experiment(ws, 'TitanicExperiment')
# retrieve the compute target (the cluster created earlier)
cluster = ws.compute_targets['cluster1']
# retrieve a curated environment
env = Environment.get(ws, name='AzureML-sklearn-0.24.1-ubuntu18.04-py37-cpu-inference')
# configure and submit your training run
src = ScriptRunConfig(source_directory='.',
                      command=['bash setup.sh && python train.py'],
                      compute_target=cluster,
                      environment=env)
run = experiment.submit(config=src)
run
Do you remember the service principal? The service principal information comes from the account details you copied earlier. This allows your cluster to authenticate against Azure ML. Another important aspect is the environment: I'm using a prebuilt Docker container image with the most popular machine learning frameworks and Python packages.
Another important aspect is the command argument. I'm using it to pass in both the setup.sh and train.py scripts. The setup.sh script installs all the required packages. The train.py script refers, in my example, to the machine learning model script, which holds the same code as Titanic.ipynb. After storing these scripts and running script.py, a job should be submitted to the cluster, like the image below.

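Once the job is submitted, you can also monitor it from your notebook. A small sketch, assuming azureml-widgets is available in the notebook environment and run is the object returned by experiment.submit in script.py:
from azureml.widgets import RunDetails

# interactive widget inside a Jupyter notebook
RunDetails(run).show()
# or simply stream the logs and block until the remote run finishes
run.wait_for_completion(show_output=True)
print(run.get_metrics())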
The big benefit is that you can easily scale up. The transition from a model developed on a compute instance to a high-performing elastic cluster is relatively easy: it's just a matter of wrapping your experiment in another (small) script.
Operationalizing your model using Azure ML Endpoints
The final step is operationalizing your model, which in our case means deploying the model as a web service. Within Azure ML you can deploy either a real-time or a batch service. To deploy your model as a real-time web service, use the code below:
# set service name
service_name = 'titanic-service'
service = Model.deploy(ws, service_name, [model_random_forest], overwrite=True)
service.wait_for_deployment(show_output=True)
The titanic-service refers to the service name. The model_random_forest in our case refers to the model you registered within our workspace. After successful deployment you should see your service under the Endpoints section:

This REST endpoint can be integrated with any other system. Let's validate the service by using Postman.

I’m making an API call by submitting 13 data attributes. The second and third attributes are Sex and Age. The values 1 and 1 correspond to a female with an age of 11 or below. After submitting our request you can see that the probability of surviving the Titanic is 88.5%.

If you change Sex and Age to 0 and 6, the values correspond to a male with an age between 41 and 66. The probability of surviving, as you can see, drops to 42.0%.
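Instead of Postman you can also call the endpoint from Python. A hedged sketch (the exact JSON schema is generated by the no-code deployment, so treat the payload format and the sample values below as assumptions):
import json
import requests

# scoring URI of the deployed web service
scoring_uri = service.scoring_uri
headers = {"Content-Type": "application/json"}

# 13 feature values in the same order as the training data (hypothetical passenger)
payload = {"data": [[3, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 3, 0]]}

response = requests.post(scoring_uri, data=json.dumps(payload), headers=headers)
print(response.status_code, response.json())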
Conclusion
What we’ve just demonstrated is an automated ML production environment, from data collection and preparation to model deployment and operationalization. A logical next step would be to store all our code in a versioned Git repository and use a continuous integration (CI) process to automate pipeline initiation, train and test automation, and the review and approval process.
Code repository: https://github.com/pietheinstrengholt/Azure-ML-and-DevOps-meets-Titanic