The world’s leading publication for data science, AI, and ML professionals.

Predicting Bank Customer Churn using Microsoft Azure Machine Learning & Python in Jupyter Notebook

Develop and train an Artificial Neural Network using Microsoft Azure Cloud Services and Python

Learn how easy it is to code and run a notebook using Azure resources and Python

Photo by Kevin Ku from Unsplash

In this tutorial, we will build an Artificial Neural Network (ANN) to predict bank customer churn using a Python notebook and Microsoft Azure services. Please note that Azure Machine Learning (ML) provides a lot more than this; in this article, I’ll demonstrate how easy it is to code and run a notebook using Azure resources and Python.

Pre-requisite

  1. You should be familiar with machine learning. Even if you are not, try to follow along and keep practicing.
  2. You should be aware of TensorFlow. You can find tons of material online to learn about it.
  3. You must have a Microsoft Azure subscription. Azure provides free credits to college and university students. Create an account here.

Creating Azure Machine Learning service

  1. Go to the Azure portal and click on the ‘+’ symbol to create a resource.
  2. Search for ‘Machine Learning’ and click on create.
  3. Enter the details and click on ‘Review + create’.
  4. Finally, click on ‘create’ to start the process of deployment.
  5. Your service should be ready to use after around 3–4 minutes and you should see something like this:

Azure Machine Learning

Azure Machine Learning Workspace
  1. Go to the Azure Machine learning portal.
  2. Make sure you log in with the same email ID you used to log in to the Azure portal.
  3. Take a moment to understand the information being asked.
  4. Select the correct subscription and workspace. Remember the workspace we created earlier? You will find it here to select.
  5. Once done, click ‘Get Started’. Your workspace is ready to be used!
  6. This is what it should look like. I would highly recommend spending some time exploring the services provided. At the bottom, you can find helpful documentation to get you started.
  7. In the Azure Machine Learning workspace, select ‘create new’ and then ‘Notebook’.
  8. A dialog pops up asking you to name your file. Here I am creating a file named ‘CustomerChurn.ipynb’.
  9. Download this dataset from Kaggle and upload it to the Azure ML portal in your Notebook folder, alongside your .ipynb file.

Create compute

Open your notebook, type ‘7+5’ in the first cell, and run it using the little triangle (Run) button on the left side of the cell. You should see the following:

Create Compute

Compute is an important concept without which you will not be able to run a cell (or entire notebook). Follow along to create a compute:

  1. Click on ‘Create compute’ as shown above.
  2. Select the virtual machine size you would like to use for your compute instance.
  3. Choose between CPU and GPU virtual machine types. When using GPU-enabled virtual machines, make sure your code is written to leverage the available GPU devices.
  4. Select other options as needed. If you are just trying out Azure, I recommend always choosing options that are free or cost the least.
  5. Select ‘create’; it will take around 10–12 minutes for the virtual machine to be ready for use. Now you should be able to run your cell.

Data Visualization, Analysis & Cleaning

Now that we are ready with all the ingredients (dataset, compute, notebook setup), it’s time to do the magic!

For the sake of simplicity, I will explain only the important code snippets here. You can check out the entire notebook on GitHub.

The first step is to read the dataset and understand the attributes’ data types, unwanted attributes, etc.
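A minimal sketch of this step, using a tiny stand-in DataFrame whose column names follow the Kaggle bank-churn dataset (in the notebook you would load the real file with pd.read_csv; the values below are illustrative, not real rows):

```python
import pandas as pd

# Tiny stand-in for the Kaggle dataset; in the notebook, load the real
# file with pd.read_csv('<your dataset file>').
df = pd.DataFrame({
    'RowNumber':   [1, 2, 3],
    'CustomerId':  [101, 102, 103],
    'Surname':     ['A', 'B', 'C'],
    'CreditScore': [619, 608, 502],
    'Geography':   ['France', 'Spain', 'France'],
    'Gender':      ['Female', 'Female', 'Male'],
    'Exited':      [1, 0, 1],
})

print(df.dtypes)  # inspect each attribute's data type

# Identifier columns carry no predictive signal, so drop them.
df1 = df.drop(['RowNumber', 'CustomerId', 'Surname'], axis=1)
print(df1.columns.tolist())
```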

The second step is to convert categorical values to numeric because ML models work on numeric data. For example, the below code replaces the value ‘Female’ with 1 and the value ‘Male’ with 0 in the entire dataset.

df['Gender'].replace({'Female':1,'Male':0},inplace=True)    
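Gender is binary, but Geography has three categories (France, Spain, Germany), so a simple 0/1 replacement won’t work there; one-hot encoding (covered in the dummy-variables video in the references) is the usual fix. A minimal sketch with illustrative rows:

```python
import pandas as pd

# Illustrative rows; in the notebook, df holds the full Kaggle dataset.
df = pd.DataFrame({
    'Geography': ['France', 'Spain', 'Germany'],
    'Gender':    ['Female', 'Male', 'Female'],
})

# Binary attribute: map directly to 1/0.
df['Gender'] = df['Gender'].replace({'Female': 1, 'Male': 0})

# Multi-category attribute: one-hot encode into indicator columns.
df1 = pd.get_dummies(df, columns=['Geography'])
print(df1.columns.tolist())
```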

Having scaled data is helpful when training an ANN. In our dataset, some attributes are not scaled.

col_to_scale = ['CreditScore','Age','Tenure','Balance','EstimatedSalary','NumOfProducts']

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()

df1[col_to_scale] = scaler.fit_transform(df1[col_to_scale])

The below code plots the number of people who exited and who didn’t, based on geographical location.
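One way to produce such a plot is with pandas crosstab, sketched here with a few illustrative rows (in the notebook, the DataFrame already holds the full dataset):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Illustrative rows; column names follow the Kaggle bank-churn dataset.
df = pd.DataFrame({
    'Geography': ['France', 'Spain', 'Germany', 'France', 'Germany'],
    'Exited':    [1, 0, 1, 0, 1],
})

# Count churned (1) vs. retained (0) customers per geography.
counts = pd.crosstab(df['Geography'], df['Exited'])
counts.plot(kind='bar')
plt.xlabel('Geography')
plt.ylabel('Number of customers')
plt.show()
```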

Output in Microsoft Azure Machine Learning Notebook

Creating the Train and Test split

We need to split the dataset into train and test sets. This is a pretty easy task, thanks to the sklearn module. To learn more about it, watch this pretty awesome video [2].

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=5)

Creating the ANN

I am going to build the model using TensorFlow/Keras. Learn more about it here. Take some time to truly understand the code. Remember, the values below are based on trial and error; I ran this model many times before finalizing them (they gave the maximum accuracy).

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(12, input_shape=(12,), activation='relu'),      #12 because number of inputs is 12
    keras.layers.Dense(6, activation='relu'),                           # hidden
    keras.layers.Dense(1, activation='sigmoid')
])

# opt = keras.optimizers.Adam(learning_rate=0.01)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=100)

Predictions & Accuracy

Once all epochs have run, you can evaluate the model using:

model.evaluate(X_test,y_test)
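The sigmoid output layer produces probabilities in [0, 1], so model.predict gives probabilities rather than class labels; a common approach is to threshold them at 0.5. A minimal sketch with made-up probabilities:

```python
import numpy as np

# Illustrative sigmoid outputs; in the notebook these come from model.predict(X_test).
y_prob = np.array([0.12, 0.81, 0.47, 0.66])

# Threshold at 0.5 to obtain hard class labels (1 = churned, 0 = retained).
y_pred = (y_prob > 0.5).astype(int)
print(y_pred)  # → [0 1 0 1]
```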

Classification reports & Confusion matrix

Classification reports are a great way to get precision, recall, f1-score, and support values for each prediction class.

A confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm.
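Both are available from sklearn; a minimal sketch with illustrative labels (in the notebook, y_test comes from the split and y_pred from the model):

```python
from sklearn.metrics import classification_report, confusion_matrix

# Illustrative labels, not actual model output.
y_test = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

# Precision, recall, f1-score, and support for each class.
print(classification_report(y_test, y_pred))

# Rows = true class, columns = predicted class.
cm = confusion_matrix(y_test, y_pred)
print(cm)
```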

Caution

If you get an error "No Module named seaborn", run the below command before using seaborn.

pip install seaborn

The above command will install the necessary library.
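Once seaborn is installed, the confusion matrix can be visualized as a heatmap. A sketch with illustrative values (not from the actual model run):

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Illustrative confusion-matrix counts for a 2-class problem.
cm = np.array([[1500,  95],
               [ 250, 155]])

sns.heatmap(cm, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('Truth')
plt.show()
```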

Delete the Azure Resources

Since Azure charges for the services being used, it is always advisable to delete resources you no longer use.

In this blog, you created two resources: a Machine Learning workspace and a resource group.

Go to the Azure portal home page to find these two resources. Open each one and delete it. Once the deletion is successful, you will no longer be charged for that resource.

Conclusion

I demonstrated how to create an ML service and a workspace in Microsoft Azure ML. Then we downloaded the dataset from Kaggle and analyzed it so that it could be fed to an ANN. With the help of TensorFlow, an ANN was created and trained. As already mentioned, there are tons of integrations and features you can utilize depending on the work at hand and your preference. I hope this blog has given you a tiny peek into them and helps you get started today!

Also, you can ask me a question on Twitter and LinkedIn!

References

[1] Tutorial: Get started in Jupyter Notebooks (Python) – Azure Machine Learning. https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-setup

[2] codebasics. (2018, August 6). Machine Learning Tutorial Python – 6: Dummy Variables & One Hot Encoding [Video]. YouTube. https://www.youtube.com/watch?v=9yl6-HEY7_s&list=PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw&index=6&ab_channel=codebasics

Hope this is helpful to you.

Thank you.
