
Train a Custom Object Detector with Detectron2 and FiftyOne

Combine the dataset curation of FiftyOne with the model training of Detectron2 to easily train custom detection models

Image 71df582bfb39b541 from the Open Images V6 dataset (CC-BY 2.0) visualized in FiftyOne

In recent years, tooling has been developed for every aspect of the machine learning (ML) lifecycle, making it easier to take a custom model from idea to reality. Most exciting is the community's propensity for open-source tools, like PyTorch and TensorFlow, which make the model development process more transparent and replicable.

In this post, we take a look at how to integrate two open-source tools tackling different parts of an ML project: FiftyOne and Detectron2. Detectron2 is a library developed by Facebook AI Research designed to allow you to easily train state-of-the-art detection and segmentation algorithms on your own data. FiftyOne is a toolkit designed to let you easily visualize your data, curate high-quality datasets, and analyze your model results.

Together, you can use FiftyOne to curate your custom dataset, use Detectron2 to train a model on your FiftyOne dataset, then evaluate the Detectron2 model results back in FiftyOne to learn how to improve your dataset, continuing the cycle until you have a high-performing model. This post closely follows the official Detectron2 tutorial, augmenting it to show how to work with FiftyOne datasets and evaluations.

Follow along in Colab!

Check out this notebook to follow along with this post right in your browser.

Screenshot of Colab notebook (image by author)

Setup

To start, we’ll need to install FiftyOne and Detectron2.

# Install FiftyOne
pip install fiftyone 
# Install Detectron2 from Source (Other options available)
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
# (add --user if you don't have permission)

# Or, to install it from a local clone:
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2

# On macOS, you may need to prepend the above commands with a few environment variables:
CC=clang CXX=clang++ ARCHFLAGS="-arch x86_64" python -m pip install ...

Now let’s import FiftyOne and Detectron2 in Python.

Prepare the Dataset

In this post, we show how to use a custom FiftyOne Dataset to train a Detectron2 model. We’ll train a license plate segmentation model from an existing model pre-trained on the COCO dataset, available in Detectron2’s model zoo.

Since the COCO dataset doesn’t have a "Vehicle registration plate" category, we will be using segmentations of license plates from the Open Images v6 dataset in the FiftyOne Dataset Zoo to train the model to recognize this new category.

Note: Images in the Open Images v6 dataset are under the CC-BY 2.0 license.

For this example, we will just use some of the samples from the official "validation" split of the dataset. To improve model performance, we could always add more data from the official "train" split as well, but that would take longer to train, so we'll stick to the "validation" split for this walkthrough.

Specifying a list of classes when downloading a dataset from the zoo ensures that only samples containing at least one of the given classes are present. However, these samples may still contain other labels, so we can use FiftyOne's powerful filtering capabilities to easily keep only the "Vehicle registration plate" labels. We will also remove the "validation" tag from these samples and create our own splits out of them.

Next, we need to parse the dataset from FiftyOne’s format to Detectron2’s format so that we can register it in the relevant Detectron2 catalogs for training. This is the most important code snippet to integrate FiftyOne and Detectron2.

Note: In this example, we are specifically parsing the segmentations into bounding boxes and polylines. This function may require tweaks depending on the model being trained and the data it expects.

Let’s visualize some of the samples to make sure everything is being loaded properly:

Visualizing Open Images V6 training dataset in FiftyOne (Image by author)

Load the Model and Train!

Following the official Detectron2 tutorial, we now fine-tune a COCO-pretrained R50-FPN Mask R-CNN model on the FiftyOne dataset. This will take a couple of minutes to run if using the linked Colab notebook.

# Look at training curves in tensorboard:
tensorboard --logdir output
Tensorboard training metrics visualization (Image by author)

Inference and Evaluation Using the Trained Model

Now that the model is trained, we can run it on the validation split of our dataset and see how it performs! To start, we need to load the trained model weights into a Detectron2 predictor.

Next, we generate predictions on each sample in the validation set, convert the outputs from Detectron2 to FiftyOne format, and add them to our FiftyOne dataset.
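A sketch of that loop, reusing `get_fiftyone_dicts` from earlier; the helper name `detectron_to_fo` is our own, and the bounding-box/mask bookkeeping assumes Detectron2's standard `Instances` output.

```python
import cv2
import fiftyone as fo

def detectron_to_fo(outputs, img_w, img_h):
    # Convert a Detectron2 Instances object into a FiftyOne Detections label
    detections = []
    instances = outputs["instances"].to("cpu")
    for pred_box, score, mask in zip(
        instances.pred_boxes, instances.scores, instances.pred_masks
    ):
        x1, y1, x2, y2 = pred_box
        # FiftyOne stores instance masks relative to their bounding box
        fo_mask = mask.numpy()[int(y1):int(y2) + 1, int(x1):int(x2) + 1]
        # FiftyOne boxes are relative [top-left-x, top-left-y, w, h]
        bbox = [
            float(x1) / img_w,
            float(y1) / img_h,
            float(x2 - x1) / img_w,
            float(y2 - y1) / img_h,
        ]
        detections.append(
            fo.Detection(
                label="Vehicle registration plate",
                bounding_box=bbox,
                mask=fo_mask,
                confidence=float(score),
            )
        )

    return fo.Detections(detections=detections)

# Run inference on the "val" split and store the results on each sample
val_view = dataset.match_tags("val")
predictions = {}
for d in get_fiftyone_dicts(val_view):
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)
    predictions[d["image_id"]] = detectron_to_fo(outputs, d["width"], d["height"])

dataset.set_values("predictions", predictions, key_field="id")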

Let’s visualize the predictions and take a look at how the model did. We can click the eye icon next to the "val" tag to view all of the validation samples on which we ran inference.

Detectron2 predictions visualized in FiftyOne. Pictures shown are licensed as CC-BY 2.0 (Image by author)

From here, we can use FiftyOne's built-in evaluation methods. The evaluate_detections() method can evaluate the instance segmentations via the use_masks=True parameter. We can also use it to compute mAP, choosing between the COCO-style (default) and Open Images-style protocols.

We can use this results object to view the mAP, print an evaluation report, plot PR curves, plot confusion matrices, and more.

                            precision    recall  f1-score   support

Vehicle registration plate       0.72      0.18      0.29       292

                 micro avg       0.72      0.18      0.29       292
                 macro avg       0.72      0.18      0.29       292
              weighted avg       0.72      0.18      0.29       292
The precision-recall curve generated in FiftyOne (Image by author)

From the PR curve, we can see that the model is not generating many predictions – resulting in many false negatives – but the predictions that are generated are often fairly accurate.

We can also create a view into the dataset to look at high-confidence false positive predictions to understand where the model is going wrong and how to potentially improve it in the future.

Incorrect predictions found in FiftyOne, picture shown is licensed CC-BY 2.0 (Image by author)

There are a few samples with false positives like this one that contain plates with characters that are not from the Latin alphabet. This indicates that we may want to introduce images from a wider range of countries into the training set.

From here, we can take findings like these to iterate on the dataset, improve the samples and annotations, then retrain the model. This loop of curation, training, and evaluation can be repeated until the model is of sufficient quality for your task.

Summary

Detectron2 and FiftyOne are two popular open-source tools designed to aid in the model and dataset sides, respectively, of ML model development. With just a couple of custom Python functions, you can use your FiftyOne-curated datasets to train a Detectron2 model and evaluate the results back in FiftyOne, letting you develop models for your computer vision tasks more easily than ever!


About Voxel51

Disclosure: I work at Voxel51 and am a developer of FiftyOne.

Headquartered in Ann Arbor, Michigan, and founded in 2016 by University of Michigan professor Dr. Jason Corso and Dr. Brian Moore, Voxel51 is the AI software company behind FiftyOne, the open-source toolkit for building high-quality datasets and computer vision models.

Learn more at fiftyone.ai!

