
Car Classification using Inception-v3

Article on training 3 models to classify the Make, Model and Year of a car using Monk and deploying them through a Flask API.

Photo by Olav Tvedt on Unsplash

Introduction

This article covers training three deep convolutional neural networks with Monk, an open-source computer vision library, and then deploying them through an API. Each model takes an image of a car as input; together they predict the Make, Model and Year of the car. The models were trained on the Cars Dataset.

For transfer learning, the Inception-v3 architecture with pre-trained weights was used. Some initial layers were frozen and training was done on the remaining layers.

After training, the models were deployed through a Flask API. It accepts an image through a POST request and returns the predictions to the user.

The training notebook is in the first GitHub repository listed in the references.

The Flask API is in the second GitHub repository listed in the references.

Table of Contents

  1. Installing Monk
  2. The Dataset
  3. Training the models
  4. Results of Training
  5. Deploying the models through API
  6. Running the API
  7. Conclusion

1. Installing Monk

Monk is an open source computer vision library. You don’t need in-depth knowledge of Python or any Deep Learning framework to be able to use it. It simplifies computer vision by providing wrapper functions for popular deep learning frameworks and enables one to use their functionalities with minimal code. Check its GitHub repository for more information.

This article uses the PyTorch backend of the Monk library, but you can install a different version of Monk if you prefer. Check the detailed instructions for installation here.

  • CPU (Non GPU) : pip install -U monk-pytorch-cpu
  • Google Colab : pip install -U monk-colab
  • Kaggle : pip install -U monk-kaggle
  • For versions that support CUDA, follow the instructions provided here.

To install the library manually, follow the instructions provided here.


2. The Dataset

The training dataset used for this task is the Cars Dataset. It contains 16,185 images of 196 classes of cars. Classes are typically at the level of Make, Model, Year, e.g. Tesla Model S 2012 or BMW M3 coupe 2012. The dataset comes with a devkit that contains the label for each image as well as the coordinates of the bounding box around the car; we’ll only use the labels. The code given here is intended to be run in a Python notebook.

Download the dataset:

# Create a directory for the dataset
! mkdir data
# Download train dataset and extract it
! wget "http://imagenet.stanford.edu/internal/car196/cars_train.tgz"
! tar -xvf 'cars_train.tgz' -C 'data'
# Download test dataset and extract it
! wget "http://imagenet.stanford.edu/internal/car196/cars_test.tgz"
! tar -xvf 'cars_test.tgz' -C 'data'
# Download the devkit and extract it
! wget "https://ai.stanford.edu/~jkrause/cars/cars_devkit.tgz"
! tar -xvf 'cars_devkit.tgz' -C 'data'

Preparing the labels:

The files in devkit needed for preparing labels:

  • cars_meta.mat: Contains a cell array of class names, one for each class.
  • cars_train_annos.mat: Contains the variable ‘annotations’, where each element has the coordinates of the bounding box, a field ‘class’ which is the integral class id of the image, and a field ‘fname’ which is the filename of the image within the folder of images.

To prepare the labels for training:

  • First, we process the file cars_meta.mat to separate the make, model and year for each class id.
  • Then, we process the file cars_train_annos.mat to label each image in the dataset with a make, a model and a year.

  • A similar procedure can be followed to assign labels to the test images after downloading cars_test_annos_withlabels.mat from here.
  • Alternatively, the prepared csv files can be downloaded from here.
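The label-splitting step can be sketched as follows. This is a minimal illustration, not the code from the training notebook: it assumes the make is the first word of the class name and the year is the last word, which fits names like the two examples above but would mislabel multi-word makes such as "Land Rover". The function name split_class_name is hypothetical.

```python
def split_class_name(class_name):
    """Split a class name such as 'BMW M3 coupe 2012' into (make, model, year).

    Assumption: the first word is the make and the last word is the year.
    Multi-word makes (e.g. 'Land Rover') would need a lookup table instead.
    """
    words = class_name.split()
    if len(words) < 3 or not words[-1].isdigit():
        raise ValueError(f"Unexpected class name format: {class_name!r}")
    make = words[0]
    model = " ".join(words[1:-1])
    year = int(words[-1])
    return make, model, year


# Class names taken from the examples in the dataset description.
class_names = ["Tesla Model S 2012", "BMW M3 coupe 2012"]
labels = [split_class_name(name) for name in class_names]
for name, (make, model, year) in zip(class_names, labels):
    print(f"{name!r} -> make={make}, model={model}, year={year}")
```

In practice the class names would be read from cars_meta.mat (for instance with scipy.io.loadmat) and the resulting triples written to the three csv files.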

Directory Structure:

./Project_directory/
|
|-------data (for dataset)
|         |
|         |------cars_test
|         |         |----------00001.jpg
|         |         |----------........(and so on)
|         |------cars_train
|         |         |----------00001.jpg
|         |         |----------........(and so on)
|         |------devkit
|         |         |----------cars_meta.mat
|         |         |----------cars_train_annos.mat
|         |         |----------........(and other files)
|                               _
|------vehicles_make.csv         |
|------vehicles_model.csv        |  (csv files with labels)
|------vehicles_year.csv        _|
|
|------.......(and other files/folders)

3. Training the models

When using Monk, experimenting with the models becomes very easy. By changing a few parameters, we can quickly see how it affects the overall performance of the model. It also speeds up the prototyping.

We’ll use the PyTorch backend of Monk. However, Keras and MXNet-gluon backends are also available.

Here, I’ll explain the procedure to train the Make classifier. The other two classifiers can be trained in a similar way. You can find the entire training notebook here.

Imports and opening a project:

Assign the dataset:

Set Model parameters:

We’ll load the Inception-v3 model with pre-trained weights for training the classifiers using transfer learning. This usually makes the model perform better when the training dataset is not big enough.

I had also tried training the ResNet-50 model, but its performance was not nearly as good as Inception-v3.

Set training parameters:

For now, we’ll train the model for 5 epochs. Training can be resumed from the final epoch if the learning rate and other hyperparameters turn out to work well. We’ll use softmax cross entropy as the loss function, since it is the standard choice for multi-class classification. For the optimiser, I experimented with stochastic gradient descent, but RMSProp seemed to perform better.
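To make the loss function concrete, here is a small pure-Python sketch of softmax cross entropy for a single example. It is only an illustration of the formula, not what Monk or PyTorch runs internally, and the logits are made-up numbers.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, target_index):
    """Negative log-probability assigned to the correct class."""
    probs = softmax(logits)
    return -math.log(probs[target_index])


# Hypothetical scores for a 3-class problem (values are made up).
logits = [2.0, 0.5, -1.0]
loss_correct = softmax_cross_entropy(logits, target_index=0)
loss_wrong = softmax_cross_entropy(logits, target_index=2)
print(f"loss when class 0 is the true class: {loss_correct:.4f}")
print(f"loss when class 2 is the true class: {loss_wrong:.4f}")
```

The loss is small when the highest score belongs to the true class and grows as the model puts more probability on the wrong classes.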

Training:

This will save the models obtained after every epoch along with some additional information related to the training into the workspace directory.

Detailed summary of the training can be obtained by running ptf.Summary().


4. Results of Training

We’ll run the models on the entire test data to get the test accuracy, and then run them on some individual images.

Evaluate the model on test data:

Evaluation returns the overall test accuracy as well as the class-wise accuracies, which help identify the classes the model struggles with.
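As an illustration of what these two kinds of numbers mean, the sketch below computes overall and class-wise accuracy from lists of true and predicted labels. It is not the Monk evaluation call; the function name accuracy_report and the labels are made up for the example.

```python
from collections import defaultdict


def accuracy_report(true_labels, predicted_labels):
    """Return (overall_accuracy, {label: class_accuracy}) as fractions in [0, 1]."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("Both label lists must have the same length")
    if not true_labels:
        raise ValueError("Need at least one labelled example")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    overall = sum(correct.values()) / len(true_labels)
    per_class = {label: correct[label] / total[label] for label in total}
    return overall, per_class


# Made-up labels for illustration only.
truth = ["BMW", "BMW", "Tesla", "Tesla", "Audi"]
preds = ["BMW", "Audi", "Tesla", "Tesla", "Audi"]
overall, per_class = accuracy_report(truth, preds)
print(f"overall: {overall:.2f}")
for label, acc in sorted(per_class.items()):
    print(f"{label}: {acc:.2f}")
```

A class with a much lower accuracy than the overall figure is a good candidate for closer inspection, for example by looking at which classes it gets confused with.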

Running the trained model for some images:

Results obtained:

The models were evaluated on the entire test data of 8,041 images. The validation set was a 20% split from the original training dataset. Accuracies obtained by the models:

  1. Make Classifier: Best validation accuracy: 94.72%. Test accuracy: 84.27%
  2. Model Classifier: Best validation accuracy: 96.50%. Test accuracy: 83.99%
  3. Year Classifier: Best validation accuracy: 94.17%. Test accuracy: 83.19%
Predictions obtained for images from test data

5. Deploying the models through API

When training the models with Monk, it automatically creates a workspace directory. It contains all the training logs and all the intermediate models obtained during training. For developing the API, we just need the final models with the same directory structure that they were created in. If you have trained your models, you can use them. Otherwise, you can download the workspace with the final models from here.

If you just want to test the API, check this. The readme file has detailed instructions to set up the environment.

Create a sub-directory named ‘uploads’ in the project directory; files uploaded by the user are saved there before the predictions are returned. Set up a virtual environment (optional, but recommended) and install the required libraries: download the requirements.txt file from here and run pip install -r requirements.txt.

Create a file named app.py in the project directory where the workspace is located. We’ll write the code for the API in this file.

Imports and utility functions:

Functions to respond to HTTP requests:

I have included a user interface for the API, but it is not necessary. If you wish to use it as well, download [this](https://github.com/PiyushM1/Car-classification-API/tree/master/templates) and this directory into the project directory, and then define the index() function given below to make the API load the webpage when accessed through a browser.

The upload() function responds to POST requests at ‘/predict’. It saves the uploaded file to the ‘uploads’ sub-directory and, if the file is a valid image, returns a string with the predictions.
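The file handling around such an endpoint can be sketched with the standard library alone. This is an assumption-laden illustration rather than the code in app.py: the helper names is_allowed_image and upload_path are hypothetical, the set of accepted extensions is a guess, and checking the extension is only a rough stand-in for verifying that a file really is an image.

```python
import os

UPLOAD_DIR = "uploads"  # sub-directory named in the article
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png"}  # assumed set, not from the article


def is_allowed_image(filename):
    """Accept only filenames whose extension is in ALLOWED_EXTENSIONS."""
    _, ext = os.path.splitext(filename)
    return ext.lower().lstrip(".") in ALLOWED_EXTENSIONS


def upload_path(filename):
    """Build the save path, dropping any directory parts sent by the client."""
    safe_name = os.path.basename(filename.replace("\\", "/"))
    return os.path.join(UPLOAD_DIR, safe_name)


# Example filenames (made up) to show which uploads would be accepted.
for name in ["car.JPG", "notes.txt", "../../etc/passwd.jpg"]:
    if is_allowed_image(name):
        print(f"{name!r} accepted, saved to {upload_path(name)!r}")
    else:
        print(f"{name!r} rejected")
```

Stripping directory components before saving keeps a crafted filename from writing outside the ‘uploads’ folder. In a Flask app, werkzeug.utils.secure_filename is commonly used for the same purpose.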

Driver function:

It loads the models and starts the server.


6. Running the API

Use the command python3 app.py in your terminal to run the app.

Once the server has started, you can test the API by sending a POST request using cURL. To do so, you’ll first need to install cURL if you don’t have it already. Then, through your terminal, run the following command after replacing <image_path> with a valid path to an image.

curl -X POST -F file=@'<image_path>' 'http://0.0.0.0:5000/predict'

This will return the prediction to your terminal itself.
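The same request can also be sent from Python. The sketch below uses only the standard library and builds the multipart/form-data body by hand; the field name file and the URL come from the cURL command above, while the helper names build_multipart and predict are hypothetical.

```python
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, file_bytes,
                    content_type="application/octet-stream"):
    """Encode one file as multipart/form-data; returns (body, content_type_header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path, url="http://0.0.0.0:5000/predict"):
    """POST the image to the running API and return the response text."""
    with open(image_path, "rb") as f:
        body, header = build_multipart("file", os.path.basename(image_path), f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Building the request body needs no server, so it can be inspected offline.
body, header = build_multipart("file", "car.jpg", b"fake-bytes")
print(header.split(";")[0])
```

With the server running, calling predict with the path to an image should return the same string that cURL prints. If 0.0.0.0 is not reachable as a client address on your system, http://127.0.0.1:5000/predict is the usual alternative.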

Alternatively, you can go to http://0.0.0.0:5000 in your browser for the user interface. Then upload any image using the Choose button, and click Predict. It will then return a prediction like this:

User interface for the API

7. Conclusion

This article covered the entire process: data preparation, training image classification models, and finally deploying them through a Flask API. We also used data normalisation, random horizontal flipping, transfer learning, and a custom optimiser, learning rate and loss function. On the test data, the three models reached accuracies between roughly 83% and 84%. However, I was a bit puzzled that the year classifier achieved such a good accuracy, given that there really aren’t any obvious features that could help in predicting the manufacturing year of a car. I’m not sure if it is a good idea to try and predict the year with just an image.

You should also try tweaking some hyperparameters or using a different model architecture to see how it goes. Monk also makes it possible to test multiple combinations of models or hyperparameters simultaneously; check it out here.


References:

  1. GitHub repository for the training notebook: https://github.com/PiyushM1/Car-make-model-and-year-classifier
  2. GitHub repository for the API: https://github.com/PiyushM1/Car-classification-API
  3. Dataset: https://ai.stanford.edu/~jkrause/cars/car_dataset.html
  4. Monk library: https://github.com/Tessellate-Imaging/monk_v1

Thanks for reading! Let me know if you found this article helpful. Let’s connect through LinkedIn.

