Introduction to Transfer Learning

A comprehensive guide on transfer learning from scratch

Muktha Sai Ajay
Towards Data Science

--

Introduction

We as humans have the ability to transfer knowledge gained in one task and use it in another; the simpler the task, the easier it is to utilize that knowledge. Some simple examples would be:

  • Know Math and Statistics → Learn Machine Learning
  • Know how to ride a bicycle → Learn how to ride a Motor Bike

Most machine learning and deep learning algorithms so far are designed to solve specific tasks. These algorithms have to be rebuilt whenever the data distribution changes, and rebuilding and retraining them is hard because it requires a lot of computational power and time.

Transfer learning is all about using a pre-trained network for our custom task, transferring what it learned from its previous task.

Instead of starting from scratch, we take an architecture such as VGG16 or ResNet, which is the result of extensive architecture design and hyperparameter tuning, and apply what it has already learned to a new task or model. This is called transfer learning.

Some of the Transfer Learning models include:

  • Xception
  • VGG16
  • VGG19
  • ResNet, ResNetV2
  • InceptionV3
  • MobileNet

Implementing Medical Application Using Transfer Learning

In this application, we will detect whether a person has pneumonia or not. We use a Kaggle dataset for this classification task. The links to the dataset and code are given below.

Dataset Link:

Code Link:

The dataset consists of a train set and a test set, each with subfolders named normal and pneumonia. The pneumonia folder has chest X-ray images of people who are suffering from pneumonia, and the normal folder has images of people who are free from lung disease.

Installing TensorFlow

You can use Google Colab if your PC or laptop doesn't have a GPU, or else you can use a Jupyter Notebook. If you use your own system, upgrade pip and then install TensorFlow by following the instructions at tensorflow.org.

Import Libraries
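The original code screenshot is not shown here, so the snippet below is a sketch of the imports this pipeline needs, assuming the TensorFlow 2.x Keras API:

```python
# Model-building pieces from the Keras API bundled with TensorFlow 2.x.
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# glob is used later to count the class subfolders.
from glob import glob
```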

Re-Size Images

Here, we will resize all the images to 224×224 because we use the VGG16 model, which accepts images of size 224×224.
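A minimal sketch: keeping the target size in one variable means every later step agrees on it.

```python
# VGG16 expects 224x224 inputs, so every image is resized to this.
IMAGE_SIZE = [224, 224]
```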

Train and Test path

We will specify the train and test paths for training.
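A sketch of this step; the folder names follow the Kaggle chest x-ray dataset layout, and the exact paths are an assumption that depends on where you unzip the download:

```python
# Paths to the unzipped Kaggle dataset (adjust to your machine).
train_path = 'chest_xray/train'
test_path = 'chest_xray/test'
```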

Importing VGG16

Here, we will import the VGG16 model with its weights for our application. We must declare an input size for the model, which we did in the previous step; the third argument, 3, indicates that the model accepts RGB (color) images. We use the ImageNet weights, and include_top = False means the model's last (fully connected) layers are removed.
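A sketch of the step described above: load VGG16 with the ImageNet weights, pass the input size from the previous step plus 3 colour channels, and drop the top classifier layers.

```python
from tensorflow.keras.applications.vgg16 import VGG16

# include_top=False removes VGG16's fully connected classifier layers;
# [224, 224] + [3] gives a 224x224 RGB input shape.
vgg = VGG16(input_shape=[224, 224] + [3], weights='imagenet', include_top=False)
```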

Training Layers

Models like VGG16, VGG19, ResNet, and others have been trained on millions of images, and their weights can already classify a thousand classes, so we reuse these weights for our own classification task instead of training those layers again.
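One way to keep those pretrained weights fixed, sketched here, is to mark every VGG16 layer as non-trainable:

```python
from tensorflow.keras.applications.vgg16 import VGG16

vgg = VGG16(input_shape=[224, 224, 3], weights='imagenet', include_top=False)

# Freeze the pretrained layers so only the new head added later is trained.
for layer in vgg.layers:
    layer.trainable = False
```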

Number of Classes

We use glob to find the number of classes in our model. The number of subfolders in the train folder equals the number of classes.
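A sketch, assuming the chest_xray/train path from the earlier step; each subfolder found by glob is one class:

```python
from glob import glob

# One subfolder per class ('NORMAL' and 'PNEUMONIA' in this dataset).
folders = glob('chest_xray/train/*')
num_classes = len(folders)
```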

Flattening

We flatten the output we got from VGG16, and since we removed its last layers, we can keep our own output layer. We replace the last layer with a dense layer sized to the number of categories in our problem statement, use softmax as its activation function, and append it to x.
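Sketched below: flatten VGG16's feature maps and append a softmax output layer sized to the number of categories (two here).

```python
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Flatten, Dense

vgg = VGG16(input_shape=[224, 224, 3], weights='imagenet', include_top=False)

# Flatten the convolutional output, then add our own output layer:
# one node per category, with softmax giving class probabilities.
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)
```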

Model

We will wrap it into a model where the input is the VGG16 input and the output is the output layer we created in the previous step.
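The wrapping step might look like this sketch (the head from the previous step is rebuilt so the snippet stands alone):

```python
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Model

vgg = VGG16(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)

# Input is VGG16's own input; output is the softmax layer we just added.
model = Model(inputs=vgg.input, outputs=prediction)
model.summary()
```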

Summary of Model

The summary of our model shows two nodes in the final dense layer because we have two different categories: pneumonia and normal.

Compile

We compile our model with categorical_crossentropy as the loss, the Adam optimizer, and accuracy as the metric. If you are not familiar with these terms, I link to my blogs at the end of the article, where I explain them in detail.
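A sketch of the compile step, applied to the model built in the previous steps:

```python
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Model

vgg = VGG16(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)
model = Model(inputs=vgg.input, outputs=prediction)

# Categorical cross-entropy loss, Adam optimizer, accuracy as the metric.
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
```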

Pre-Processing

We will apply some transformations to the training images to avoid overfitting; without them, we would get a large difference between the accuracy on the training set and on the test set.

We perform geometrical transformations such as flipping the image horizontally or vertically and zooming in and out; many more are possible. We apply them so that our model won't over-learn the training images. We perform these transformations using the ImageDataGenerator class.
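These augmentations can be set up through ImageDataGenerator; the specific ranges below are illustrative choices, not the only valid ones.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training generator: rescale pixel values to [0, 1] and add random shear,
# zoom, and horizontal flips so the model does not over-learn the images.
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
```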

We don't apply transformations to the test set because we only use it for evaluation; the only preprocessing the test images need is rescaling, and we feed them in at the same target size we defined for the training images.

flow_from_directory connects the image augmentation process to our training set. We need to mention the path of the training set. The target size is the size of the images fed into the neural network, the batch size is the number of images in a batch, and the class mode is categorical because we have two output classes.

Now we define the test set, which imports the test images from their directory, with the same batch size, target size, and class mode as the training set.
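Both generators are connected to the image folders with flow_from_directory. A sketch, guarded so it only runs if the dataset has actually been downloaded to the assumed chest_xray/ path; the batch size of 32 is an illustrative choice:

```python
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2,
                                   zoom_range=0.2, horizontal_flip=True)
# The test set is only rescaled: no augmentation at evaluation time.
test_datagen = ImageDataGenerator(rescale=1./255)

if os.path.isdir('chest_xray'):  # only if the Kaggle dataset is present
    training_set = train_datagen.flow_from_directory('chest_xray/train',
                                                     target_size=(224, 224),
                                                     batch_size=32,
                                                     class_mode='categorical')
    test_set = test_datagen.flow_from_directory('chest_xray/test',
                                                target_size=(224, 224),
                                                batch_size=32,
                                                class_mode='categorical')
```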

Fit the Model

We will fit our model for five epochs, with steps per epoch equal to the length of the training set and validation steps equal to the length of the test set.
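A consolidated sketch of the fit step, rebuilding the model and generators from the earlier steps so the snippet stands alone; the chest_xray/ path and batch size of 32 are assumptions:

```python
import os
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rebuild the frozen VGG16 base with the softmax head from earlier steps.
vgg = VGG16(input_shape=[224, 224, 3], weights='imagenet', include_top=False)
for layer in vgg.layers:
    layer.trainable = False
x = Flatten()(vgg.output)
prediction = Dense(2, activation='softmax')(x)
model = Model(inputs=vgg.input, outputs=prediction)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

if os.path.isdir('chest_xray'):  # train only when the dataset is present
    train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2,
                                       zoom_range=0.2, horizontal_flip=True)
    test_datagen = ImageDataGenerator(rescale=1./255)
    training_set = train_datagen.flow_from_directory(
        'chest_xray/train', target_size=(224, 224),
        batch_size=32, class_mode='categorical')
    test_set = test_datagen.flow_from_directory(
        'chest_xray/test', target_size=(224, 224),
        batch_size=32, class_mode='categorical')
    # Five epochs; one full pass over each generator per epoch.
    history = model.fit(training_set,
                        validation_data=test_set,
                        epochs=5,
                        steps_per_epoch=len(training_set),
                        validation_steps=len(test_set))
```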

That's great: we achieved a training accuracy of around 97.7% and a validation accuracy of 91.5%. That is the power of transfer learning. I hope you enjoyed this tutorial on transfer learning. If you would like to know how artificial neural networks and convolutional neural networks work, along with an application, check out my blogs below:

--