RetinaNet: Custom Object Detection training with 5 lines of code

Making computer vision easy with Monk, a low-code deep learning tool and unified wrapper for computer vision.

Akula Hemanth Kumar
Towards Data Science


Indoor Object detection

In a previous article, we built a custom object detector using Monk’s EfficientDet. In this article, we will build an indoor object detector using Monk’s RetinaNet, which is built on top of PyTorch RetinaNet.

These days, computer vision is used everywhere, from self-driving cars to surveillance cameras. Getting started usually means learning several deep learning frameworks, such as TensorFlow, PyTorch, and MXNet, which is a tedious process.

With that in mind, I would like to introduce you to Monk, a fully functional, low-code, easily installable set of object detection pipelines.

Let’s get started!!

Table of Contents

  1. Data Collection
  2. Convert to COCO format
  3. Training model
  4. Testing object detector

Data Collection

Here we use the OpenImages dataset, collected with OIDv4_ToolKit. I have chosen 25 classes from the dataset; you can choose as many as you want.

Example command used to collect data:

python main.py downloader --classes Apple Orange --type_csv validation

Open a terminal and run the following commands to download the Alarm_clock class from the train split. The last command renames the downloaded folder so that the class name contains no space. You can repeat this process to download other classes.

$ git clone https://github.com/EscVM/OIDv4_ToolKit
$ cd OIDv4_ToolKit
$ python main.py downloader --classes Alarm_clock --type_csv train
$ mv OID/Dataset/train/Alarm\ clock OID/Dataset/train/Alarm_clock
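When several classes are downloaded, every folder whose name contains a space needs the same rename as the `mv` command above. Here is a minimal Python sketch of that step, assuming the `OID/Dataset/train/<class name>` layout used in the commands. The helper name `normalize_class_dirs` is hypothetical and not part of OIDv4_ToolKit or Monk.

```python
import os


def normalize_class_dirs(split_dir):
    """Rename class folders such as 'Alarm clock' to 'Alarm_clock'.

    Returns the list of (old_name, new_name) pairs that were renamed.
    """
    renamed = []
    for name in sorted(os.listdir(split_dir)):
        old_path = os.path.join(split_dir, name)
        # Only directories with a space in their name need renaming.
        if not os.path.isdir(old_path) or " " not in name:
            continue
        new_name = name.replace(" ", "_")
        new_path = os.path.join(split_dir, new_name)
        if os.path.exists(new_path):
            # Skip rather than overwrite an existing folder.
            continue
        os.rename(old_path, new_path)
        renamed.append((name, new_name))
    return renamed
```

Calling `normalize_class_dirs("OID/Dataset/train")` from inside the toolkit folder would then cover all downloaded classes in one pass.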

Alternatively, you can download the already formatted dataset directly:

$ wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1bXzK3SYRCoUj9-zsiLOSWM86LJ6z9p0t' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1bXzK3SYRCoUj9-zsiLOSWM86LJ6z9p0t" -O OIDv4_ToolKit.zip && rm -rf /tmp/cookies.txt

Convert to COCO format

COCO format

./ (root_dir)
|
|------Dataset (coco_dir)
        |
        |------Images (set_dir)
        |        |
        |        |----Alarm_clock
        |        |        |
        |        |        |---------img1.jpg
        |        |        |---------img2.jpg
        |        |        |---------..........(and so on)
        |        |
        |        |----Curtain
        |        |        |
        |        |        |---------img1.jpg
        |        |        |---------img2.jpg
        |        |        |---------..........(and so on)
        |        |
        |        |----...........(and so on)
        |
        |------annotations
                 |
                 |--------------------instances_Images.json (instances_<set_dir>.json)
                 |--------------------classes.txt
  • instances_Images.json -> In proper COCO format
  • classes.txt -> A list of classes in alphabetical order

For Train Set

  • root_dir = "OIDv4_ToolKit/OID/";
  • coco_dir = "Dataset";
  • img_dir = "./";
  • set_dir = "Images";

Note: The annotation file name must match set_dir, that is, instances_<set_dir>.json.
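To make the naming rule concrete, here is a small sketch that derives the expected paths from the four values above and reports which ones are still missing. It assumes images live under `root_dir/coco_dir/img_dir/set_dir`, which matches the tree shown earlier. The helper names `coco_layout` and `missing_paths` are hypothetical, not Monk functions.

```python
import os


def coco_layout(root_dir, coco_dir, img_dir, set_dir):
    """Return the paths implied by the directory layout described above."""
    base = os.path.join(root_dir, coco_dir)
    ann_dir = os.path.join(base, "annotations")
    return {
        # normpath collapses the "./" used for img_dir.
        "images": os.path.normpath(os.path.join(base, img_dir, set_dir)),
        # The annotation file is named after set_dir.
        "annotations": os.path.join(ann_dir, "instances_" + set_dir + ".json"),
        "classes": os.path.join(ann_dir, "classes.txt"),
    }


def missing_paths(root_dir, coco_dir, img_dir, set_dir):
    """List the expected paths that do not exist on disk yet."""
    layout = coco_layout(root_dir, coco_dir, img_dir, set_dir)
    return [path for path in layout.values() if not os.path.exists(path)]
```

Running `missing_paths("OIDv4_ToolKit/OID/", "Dataset", "./", "Images")` before training is a cheap way to catch a misnamed annotation file.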

We convert to COCO format via Monk format:

1. Convert from the current format to Monk format.

2. Convert from Monk format to COCO format.
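The first step can be sketched as follows. Two assumptions are baked in and should be checked against your data: that each OIDv4_ToolKit label line reads `<class name> x1 y1 x2 y2` in pixel coordinates, and that Monk format stores all boxes of an image in one string of the form `x1 y1 x2 y2 label ...`. The function names are hypothetical.

```python
def parse_oid_label_line(line):
    """Split 'Alarm clock 10.0 20.0 110.0 220.0' into (label, [x1, y1, x2, y2]).

    The class name may contain spaces, so the last four fields are the box.
    """
    parts = line.split()
    if len(parts) < 5:
        raise ValueError("expected '<class name> x1 y1 x2 y2', got: %r" % line)
    # Join multi-word class names with underscores, as in the folder names.
    label = "_".join(parts[:-4])
    box = [float(v) for v in parts[-4:]]
    return label, box


def to_monk_label(lines):
    """Join all boxes of one image into a single 'x1 y1 x2 y2 label ...' string."""
    fields = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        label, box = parse_oid_label_line(line)
        fields.extend(str(int(round(v))) for v in box)
        fields.append(label)
    return " ".join(fields)
```

Each image then contributes one row (file name plus this label string) to the intermediate CSV.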

To get classes.txt, run:
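A minimal sketch of this step, assuming classes.txt holds one class name per line in alphabetical order (the ordering is stated above; the one-per-line layout is an assumption). `write_classes_file` is a hypothetical helper name.

```python
def write_classes_file(class_names, out_path):
    """Write unique class names to out_path, one per line, alphabetically."""
    classes = sorted(set(class_names))
    with open(out_path, "w") as f:
        for name in classes:
            f.write(name + "\n")
    return classes
```

The returned list is also the natural source for category ids when building the annotation file.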

For the ‘.json’ file, run:
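The sketch below builds a COCO-style annotation dictionary from Monk-style rows. It assumes each row is a `(file_name, "x1 y1 x2 y2 label ...")` pair and that the caller supplies image sizes. Category ids start at 0 here, which is an arbitrary choice. `monk_rows_to_coco` is a hypothetical name, not a Monk function.

```python
import json


def monk_rows_to_coco(rows, classes, image_sizes):
    """Build a COCO-style dict.

    rows: list of (file_name, "x1 y1 x2 y2 label ...") pairs (assumed format).
    classes: class names in the order used for category ids.
    image_sizes: {file_name: (width, height)}, supplied by the caller.
    """
    cat_ids = {name: i for i, name in enumerate(classes)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [
            {"id": i, "name": name, "supercategory": "none"}
            for name, i in cat_ids.items()
        ],
    }
    ann_id = 0
    for image_id, (file_name, label_str) in enumerate(rows):
        width, height = image_sizes[file_name]
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        fields = label_str.split()
        # Each box occupies five fields: x1 y1 x2 y2 label.
        for k in range(0, len(fields), 5):
            x1, y1, x2, y2 = (float(v) for v in fields[k:k + 4])
            w, h = x2 - x1, y2 - y1
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[fields[k + 4]],
                "bbox": [x1, y1, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
                "segmentation": [],
            })
            ann_id += 1
    return coco
```

Writing the result with `json.dump` to `annotations/instances_Images.json` gives a file name that matches set_dir.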

Training model

We chose “resnet50” as the backbone for this experiment. You can set the hyperparameters as suggested in the code. If you are using a GPU, set use_gpu=True; the default is False. We are using 4 GPUs, so gpu_devices=[0, 1, 2, 3]. If you are using one GPU, change this to gpu_devices=[0]. Set the number of epochs and a model name with the ‘.pt’ extension.

As mentioned in the title, we need only 5 lines of code for training. Here is Train.py:
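The GPU and naming rules in the paragraph above can be collected in one place. This is only an illustrative sketch: `use_gpu` and `gpu_devices` are the names used in the text, while the function `training_settings` and the remaining keys are hypothetical and not Monk's API.

```python
def training_settings(num_gpus, num_epochs, model_name):
    """Collect the settings described above into one dict."""
    if num_epochs < 1:
        raise ValueError("num_epochs must be at least 1")
    # The saved model name is expected to carry the '.pt' extension.
    if not model_name.endswith(".pt"):
        model_name += ".pt"
    return {
        "backbone": "resnet50",
        "use_gpu": num_gpus > 0,
        # 4 GPUs -> [0, 1, 2, 3]; 1 GPU -> [0]; no GPU -> [].
        "gpu_devices": list(range(num_gpus)),
        "num_epochs": num_epochs,
        "output_model_name": model_name,
    }
```

For the four-GPU setup used here, `training_settings(4, 10, "final_model")` yields `gpu_devices == [0, 1, 2, 3]` and a model name of `final_model.pt`; the epoch count of 10 is just a placeholder.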

Testing object detector

After training the model, we get the weights file. Load the weights and start predicting.

Some sample inferences:
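Raw detector output usually contains many low-confidence boxes, so predictions are typically filtered by score before drawing them. The sketch below assumes the detector returns parallel sequences of scores, class indices, and boxes. That is an assumption about the output shape rather than Monk's documented API, and both `keep_confident` and the 0.4 default threshold are hypothetical.

```python
def keep_confident(scores, labels, boxes, class_list, threshold=0.4):
    """Return (class_name, score, box) for detections at or above threshold."""
    kept = []
    for score, label, box in zip(scores, labels, boxes):
        if score >= threshold:
            kept.append((class_list[label], float(score), list(box)))
    # Highest-confidence detections first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept
```

Raising the threshold trades recall for fewer false positives, so it is worth tuning on a few held-out images.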

Inference 1
Inference 2

You can find the complete code on GitHub. Give us a ⭐️ on our GitHub repo if you like Monk.

In this experiment, we created a custom object detector using RetinaNet with just basic programming skills, without needing to know the architecture or the PyTorch framework.

For more examples of custom object detection, check out

If you have any questions, feel free to reach out to Abhishek and Akash.

I am extremely passionate about computer vision and deep learning. I am an open-source contributor to Monk Libraries.

You can also see my other writings at:

Photo by Srilekha
