Preparing Open Images Dataset for TensorFlow Object Detection

A guide to creating a dataset for TensorFlow Object Detection using Open Images

Het Pandya
Towards Data Science


Image by Andrew Neel on Unsplash

Data is one of the strongest pillars to consider while building a deep learning model. The more precise the data, the better the model performs. To train a deep learning model, you need lots of data; you can either create your own or use one of the public datasets available across the internet, such as MS COCO, ImageNet, Open Images, etc.

Sometimes these datasets follow one format while the model you want to train expects another. Making the data usable by the model becomes a hassle when the dataset is very large. This is what I stumbled upon while creating my own object detector.

Object Detection is a branch of computer vision where you locate particular objects in an image. I used the TensorFlow Object Detection API to create my custom object detector, and I built my data from the Open Images V4 dataset. The dataset has a collection of 600 classes and around 1.7 million images in total, split into training, validation and test sets. It has since been updated to V6, but I decided to go with V4 because of two tools that we will look at soon.

To train a TensorFlow Object Detection model, you need to create TFRecords, which require the following:

1. Images

2. Annotations for the images

Open Images has both the images and their annotations. However, all the annotations are clubbed into a single file, which gets clumsy when you want data for specific classes only. To handle this, I decided to convert the data into the PASCAL VOC format. Now, you might ask: what is PASCAL VOC? In short, the PASCAL VOC format has one XML file per image containing the coordinates of the bounding boxes of each object in that image. Pretty sorted, right? Here is an amazing reference if you wish to know more about PASCAL VOC.
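To make the format concrete, here is a minimal sketch of one such annotation and how you might read the boxes back with Python's standard library. The file name and box coordinates are made up for illustration:

```python
import xml.etree.ElementTree as ET

# A minimal PASCAL VOC annotation (illustrative values, not real data)
voc_xml = """<annotation>
  <filename>phone_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>Mobile phone</name>
    <bndbox>
      <xmin>120</xmin><ymin>80</ymin><xmax>360</xmax><ymax>420</ymax>
    </bndbox>
  </object>
</annotation>"""

root = ET.fromstring(voc_xml)
for obj in root.iter("object"):
    name = obj.findtext("name")
    box = [int(obj.find("bndbox").findtext(t))
           for t in ("xmin", "ymin", "xmax", "ymax")]
    print(name, box)  # Mobile phone [120, 80, 360, 420]
```

Each `<object>` element describes one detection: the class name plus the top-left and bottom-right corners of its bounding box.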

Now enough of talking, let’s see this in action!

But but but! Before we begin, you might want to walk along with me, so here is the notebook from my github repo. LET’S GET STARTED!

I suggest using Google Colab, because some of the files the tools need are large and may not be useful to keep around once the dataset is built.

  1. This tool lets us download images of specific classes in specific amounts. Download the tool by cloning its repo.
  2. Once cloned, you’ll find a classes.txt file. Here you need to mention the class names whose data you want to collect. You can find the list of classes here. For example, I shall take the class ‘Mobile phone’ and 98 images for it.
  3. Paste the following code:
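The original snippet is not reproduced here, so below is a sketch of the kind of command the downloader expects. The entry script name (`main.py`) and the flags (`--classes`, `--type_csv`, `--limit`) are assumptions based on typical downloader tools; verify them against the toolkit's README:

```python
# Hypothetical downloader invocation from a Colab cell; script name and
# flag names are assumptions -- check the toolkit's README before running.
cmd = [
    "python", "main.py", "downloader",
    "--classes", "classes.txt",  # file listing the classes to download
    "--type_csv", "train",       # split: train / validation / test
    "--limit", "98",             # number of images per class
]
print(" ".join(cmd))

# To actually run it:
# import subprocess
# subprocess.run(cmd, check=True)
```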

When you run the script for the first time, the tool might ask you to download some files; allow it. Here’s what the prompt looks like:

Screenshot of Open Images Dataset Toolkit

The tool creates a directory per split, with a sub-directory for each class inside it.

  4. Now let’s take a look at the number of files for each class using the following script:
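The counting script itself is not shown above; a minimal sketch could look like this. The commented path is a hypothetical example of the toolkit's `<split>/<class>` directory layout, not a path guaranteed to exist on your machine:

```python
import os

def count_entries(class_dir):
    """Count the entries (files and folders) directly inside class_dir."""
    return len(os.listdir(class_dir))

# Hypothetical path following the toolkit's split/class layout:
# count_entries("OID/Dataset/train/Mobile phone")
```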

Output to view files in Mobile Phone directory

Hmm… but I mentioned 98 images, right? The extra entry is a folder named ‘Label’, which the tool creates to hold the annotations for each image.

Let’s see what the folder ‘Label’ contains:

Content in the Label directory

We get one annotation file per image. But are we done? Nope! The annotations are not in the PASCAL VOC format yet, so we need another tool that can convert them into the required format.

This tool does the final job for us. Clone it. I have modified it slightly to handle classes with more than one word in their names; you can find the modified version on my repo here.

Once done, run the following command to convert the annotation files:
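The exact command is not reproduced above; as a sketch, the converter is typically invoked like this. The script name and the example paths are assumptions for illustration — use the names from the cloned repo and your own dataset layout:

```python
# Hypothetical converter invocation; script name and paths are
# illustrative -- check the converter repo's README for the real ones.
cmd = [
    "python", "OIDv4_to_VOC.py",
    "--sourcepath", "OID/Dataset/train/Mobile phone/Label",  # txt annotations
    "--dest_path", "OID/Dataset/train/Mobile phone",         # where XMLs go
]
print(" ".join(cmd))
```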

Here, sourcepath is the location of the ‘Label’ folder for each class and dest_path is where your XML annotations will be stored.

Once the XML annotations are created, we no longer need the txt annotations. Let’s remove them:
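The removal snippet is not shown above; a minimal sketch that walks the dataset and deletes every ‘Label’ folder could look like this (the dataset root path in the usage comment is an assumption):

```python
import os
import shutil

def remove_label_dirs(dataset_root):
    """Delete every sub-directory named 'Label' under dataset_root."""
    removed = []
    # topdown=False so we never descend into a directory we just deleted
    for dirpath, dirnames, _ in os.walk(dataset_root, topdown=False):
        for d in dirnames:
            if d == "Label":
                path = os.path.join(dirpath, d)
                shutil.rmtree(path)
                removed.append(path)
    return removed

# e.g. remove_label_dirs("OID/Dataset")
```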

This removes all the folders named ‘Label’ in all the class directories.

Now you have your filtered dataset! 😀

Tip: Once the dataset is ready, verify the annotations using labelImg; objects are sometimes annotated incorrectly.

That’s it folks! Thank you for reading 😊
