
Build your own Annotation Tool for Image Classification in 5 Minutes

How to label an image dataset with OpenCV and Python

Image generated with AI

Whenever machine learning is applied to solve a problem, the goal is, in one way or another, to fit a model to data. For your model to perform well and generalize to unseen data, you need a high-quality training dataset. Especially in a supervised learning setting, you need to make sure that your data is accurately labeled.

Data is the most important part of machine learning.

No matter how large you make your model, how many billions of parameters you throw at it, or how much augmentation you put the dataset through, poor input will not magically turn into high-quality output.

Depending on the task you’re trying to solve, there is not always an adequate public dataset available. In these cases, you might need to build your own dataset. However, in the beginning your data is most likely not labeled. Let me show you how we can build a simple, quick annotation tool to label the images of an unlabeled dataset for classification.

Demo

Image Dataset

Sample from the dataset

To demonstrate the annotation tool, I will be using an image dataset recorded with my phone, where the goal is to classify four different USB connector types: USB-A, USB-C, Micro USB, and Mini USB. In the beginning, all images are unlabeled and sit in an input directory. Our annotation tool should then present the images one at a time and, upon specifying the class, move each image to the appropriate directory.
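For reference, the folder structure before and after annotation could look something like this (the image file names are just placeholders):

input/
├── IMG_0001.jpg
├── IMG_0002.jpg
└── ...

output/
├── usb_a/
│   └── IMG_0001.jpg
├── usb_c/
│   └── IMG_0002.jpg
├── usb_micro/
└── usb_mini/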

Annotation Tool in action

Guide

Prerequisites

If you want to follow along, you should install opencv-python. If you need sample images, you can find some in the example folder of the project repository.
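If you don’t have it installed yet, it is available on PyPI:

pip install opencv-python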

Data Loading

First, let’s start by loading our images from the input folder. We can use the glob method of pathlib’s Path class to look for all files with a .jpg extension. Passing the result to the sorted function, we make sure that the images are processed in a deterministic order.

from pathlib import Path

input_path = Path("input")
input_img_paths = sorted(input_path.glob("*.jpg"))

Let’s also prepare the output directory by making sure it exists.

output_path = Path("output")
output_path.mkdir(parents=True, exist_ok=True)

We can loop through our image list and load each image into an array with cv2.imread. Let’s show the image and wait for a key press. By setting the delay in the cv2.waitKey function to 0, we wait indefinitely until a key is pressed. We then make sure that we can quit the application by pressing Q, and at the end we close all OpenCV windows.

import cv2

...

def annotate_images(
    input_img_paths: list[Path],
    output_path: Path,
) -> None:

    for img_path in input_img_paths:
        # load the image as an array (BGR)
        img = cv2.imread(str(img_path))

        cv2.imshow("Image", img)

        while True:
            # wait indefinitely for a key press
            key = cv2.waitKey(0)

            # Quit Annotation Tool
            if key == ord("q"):
                cv2.destroyAllWindows()
                return

    cv2.destroyAllWindows()

NOTE: By applying a bitwise AND (&) with 0xFF, we only keep the last 8 bits of the key code. This makes sure that, even if a modifier such as NumLock is active, the number keys still match the values returned by the ord function.
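If you run into this issue, the masking is a one-line change to the waitKey call from above:

# mask the key code to its last 8 bits
key = cv2.waitKey(0) & 0xFF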

Labeling

Let’s define the labels for our task in a list of strings. In my case, I have four labels for the different connectors:

...

def annotate_images(
    input_img_paths: list[Path],
    output_path: Path,
    labels: list[str],
) -> None:
    ...

annotate_images(
    input_img_paths=input_img_paths,
    output_path=output_path,
    labels=["usb_a", "usb_c", "usb_mini", "usb_micro"],
)

Now we want the number keys 0, 1, 2 and 3 to classify our image into the respective label folder. The key variable from the waitKey function is an integer representing the code of the character pressed. To check whether the key is one of the numbers, we need to convert each index to its character code with the ord function, similar to how we check for the key q to close the window. The ord function expects a string of length 1, so we need to convert the index to a string before passing it in.

...

while True:

    ...

    for i in range(len(labels)):
        if key == ord(str(i)):
            label = labels[i]
            print(f"Classified as {label}")

            # TODO: move to correct label folder

            break

To move the image to the classified label folder in the output path, we can use the / operator from pathlib to concatenate paths and then the rename method to move the file to its target destination.

...

if key == ord(str(i)):
    label = labels[i]
    print(f"Classified as {label}")

    output_img_path = output_path / label / img_path.name
    img_path.rename(output_img_path)

    break

Before we can do that though, we need to make sure that the target folder exists. So before the loop, we go through all our labels and create the respective folders.

...

# create all classification folders
for label in labels:
    label_dir = output_path / label
    label_dir.mkdir(parents=True, exist_ok=True)

while True:
    ...

An alternative and more Pythonic way for the label key check would be to create a mapping from key codes to labels before the loop. This way, we don’t need to loop over all labels on every iteration.

# mapping from key to label
labels_key_dict = {ord(str(i)): label for i, label in enumerate(labels)}

while True:
    ...

    if key in labels_key_dict:
        label = labels_key_dict[key]
        print(f"Classified as {label}")

        output_img_path = output_path / label / img_path.name
        img_path.rename(output_img_path)

        break

Let’s also add a small help text for the key-to-label mapping. We draw it onto the image with cv2.putText before showing the image with cv2.imshow.

for i, label in enumerate(labels):
    cv2.putText(
        img,
        f"{i}: {label}",           # text to draw
        (10, 30 + 30 * i),         # position, one line per label
        cv2.FONT_HERSHEY_SIMPLEX,  # font face
        1,                         # font scale
        (255, 255, 255),           # color (white)
        2,                         # thickness
        cv2.LINE_AA,               # anti-aliased line type
    )
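Putting all the pieces from this guide together, a minimal version of the complete tool could look like the following sketch (the code in the repository may be organized slightly differently):

import cv2
from pathlib import Path


def annotate_images(
    input_img_paths: list[Path],
    output_path: Path,
    labels: list[str],
) -> None:
    # create all classification folders
    for label in labels:
        (output_path / label).mkdir(parents=True, exist_ok=True)

    # mapping from key code to label
    labels_key_dict = {ord(str(i)): label for i, label in enumerate(labels)}

    for img_path in input_img_paths:
        img = cv2.imread(str(img_path))

        # draw the key-to-label help text onto the image
        for i, label in enumerate(labels):
            cv2.putText(
                img,
                f"{i}: {label}",
                (10, 30 + 30 * i),
                cv2.FONT_HERSHEY_SIMPLEX,
                1,
                (255, 255, 255),
                2,
                cv2.LINE_AA,
            )

        cv2.imshow("Image", img)

        while True:
            key = cv2.waitKey(0)

            # quit the annotation tool
            if key == ord("q"):
                cv2.destroyAllWindows()
                return

            # classify the image and move it to its label folder
            if key in labels_key_dict:
                label = labels_key_dict[key]
                print(f"Classified as {label}")

                output_img_path = output_path / label / img_path.name
                img_path.rename(output_img_path)

                break

    cv2.destroyAllWindows()


if __name__ == "__main__":
    input_path = Path("input")
    input_img_paths = sorted(input_path.glob("*.jpg"))

    output_path = Path("output")
    output_path.mkdir(parents=True, exist_ok=True)

    annotate_images(
        input_img_paths=input_img_paths,
        output_path=output_path,
        labels=["usb_a", "usb_c", "usb_mini", "usb_micro"],
    )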

Conclusion

In this tutorial, you learned how to create a simple annotation tool for an image classification task. There are a lot of things we could improve on this tool. One thing I want to explore further is adding the possibility to not only classify images, but also segment them and create segmentation masks.

Of course, there are plenty of far more sophisticated tools out there that streamline your annotation process. However, sometimes a very simple tool is all you need, especially when doing exploratory data analysis in the early stages of a project and you need a quick proof of concept.

The full code for this post is available on my GitHub; you can find it below. Happy coding!

GitHub – trflorian/annotation-tool


All visualizations in this post were created by the author.

