Image Processing with Python — Blob Detection using Scikit-Image

How to identify and segregate specific blobs in your image

Published in Towards Data Science · 6 min read · Jan 27, 2021

One of the most important skills a data scientist needs when working with images is being able to identify specific parts of the image. An image only becomes useful if specific points of interest can be identified and itemized. In this article we will learn how to do just that.

Let’s get started!

As always, let us first import the libraries we need for this article.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import skimage
from skimage.io import imread, imshow
from skimage.color import rgb2gray, rgb2hsv
from skimage.measure import label, regionprops, regionprops_table
from skimage.filters import threshold_otsu
from scipy.ndimage import median_filter
from matplotlib.patches import Rectangle
from tqdm import tqdm

Excellent, now let us load the image we will be working on.

tree = imread('laughing_tree.png')
imshow(tree);

We will work with the above image. Our task is to identify and segregate the sections of the image that contain the tree’s unique fruit (which looks like an open mouth).

The first thing we should do is check whether there is an easy way to isolate the fruit based on pixel intensity. Let us convert the image to greyscale and use Otsu’s method to see if this gives us a decent mask.
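Before running this on the tree, it may help to see what Otsu’s method returns in an easy case. The sketch below is only an illustration: the synthetic image and the names `toy`, `toy_thresh`, and `dark_fraction` are my own assumptions, not part of the original workflow.

```python
import numpy as np
from skimage.filters import threshold_otsu

# Synthetic grayscale image (an assumption, not the tree photo):
# a bright background around 0.8 with a dark square around 0.2.
rng = np.random.default_rng(0)
toy = rng.normal(0.8, 0.02, size=(64, 64))
toy[16:48, 16:48] = rng.normal(0.2, 0.02, size=(32, 32))
toy = np.clip(toy, 0.0, 1.0)

toy_thresh = threshold_otsu(toy)    # scalar cut-off between the two intensity groups
toy_binary = toy < toy_thresh       # True where a pixel is darker than the cut-off
dark_fraction = toy_binary.mean()   # share of pixels flagged as foreground
```

When the histogram has two well-separated groups like this, the threshold falls between them and the mask recovers roughly the dark square (about a quarter of the pixels). A real photograph rarely has such a clean histogram, which is worth keeping in mind for the result below.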

tree_gray = rgb2gray(tree)
otsu_thresh = threshold_otsu(tree_gray)
tree_binary = tree_gray < otsu_thresh
imshow(tree_binary, cmap = 'gray');

This is clearly not working out well. Let us iterate over several threshold levels and see if one of them produces a better mask.
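A quick numeric companion to the visual check is to measure how much of the image each threshold keeps. This is a minimal sketch on a tiny made-up array; the helper name `dark_fractions` is hypothetical.

```python
import numpy as np

def dark_fractions(gray, thresholds):
    """Fraction of pixels strictly darker than each threshold."""
    return [float((gray < t).mean()) for t in thresholds]

# Tiny hypothetical grayscale image with values in the 0..1 range.
toy_gray = np.array([[0.1, 0.3],
                     [0.6, 0.9]])
fractions = dark_fractions(toy_gray, [0.2, 0.5, 1.0])
```

A fraction that jumps sharply between two neighbouring thresholds hints at where the interesting cut-off lies. The function below makes the same comparison visually on the tree image.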

def threshold_checker(image):
    thresholds = np.arange(0.1, 1.1, 0.1)
    tree_gray = rgb2gray(image)
    fig, axes = plt.subplots(2, 5, figsize=(17, 10))
    # One panel per threshold; pixels darker than the threshold are True.
    for n, ax in enumerate(axes.flatten()):
        ax.set_title(f'Threshold  : {round(thresholds[n],2)}',
                     fontsize = 16)
        threshold_tree = tree_gray < thresholds[n]
        ax.imshow(threshold_tree)
        ax.axis('off')
    fig.tight_layout()

threshold_checker(tree)

Binarized Images at different threshold levels

We can see that though the thresholding seems to help, it still includes significant portions of the image that we are not interested in. Let us try another approach.

tree_hsv = rgb2hsv(tree[:,:,:-1])
plt.figure(num=None, figsize=(8, 6), dpi=80)
plt.imshow(tree_hsv[:,:,0], cmap='hsv')
plt.colorbar();

If we put the image into an HSV Colorspace, we can see that the fruits clearly have a red hue that is not present in other portions of the image. Let us try segregating these sections of the image.
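It helps to know how scikit-image scales hue: `rgb2hsv` returns hue in the range 0 to 1, and because hue is circular, reds sit at both ends of that range. The pixels below are hypothetical colours chosen for illustration, not values sampled from the tree image.

```python
import numpy as np
from skimage.color import rgb2hsv

# Three hypothetical RGB pixels in the 0..1 float range, shaped (1, 3, 3).
pixels = np.array([[[1.0, 0.0, 0.2],    # slightly purplish red
                    [0.0, 1.0, 0.0],    # pure green
                    [0.0, 0.0, 1.0]]])  # pure blue

hues = rgb2hsv(pixels)[0, :, 0]               # hue channel only
in_red_band = (hues > 0.80) & (hues <= 1.00)  # True only for the red pixel
```

Green lands near 0.33 and blue near 0.67, while the purplish red lands near 0.97. An orange-tinted red would instead land just above 0 and fall outside this band. The mask below keeps the 0.80 to 1.00 band.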

lower_mask = tree_hsv [:,:,0] > 0.80
upper_mask = tree_hsv [:,:,0] <= 1.00
mask = upper_mask*lower_mask
red = tree[:,:,0]*mask
green = tree[:,:,1]*mask
blue = tree[:,:,2]*mask
tree_mask = np.dstack((red,green,blue))
plt.figure(num=None, figsize=(8, 6), dpi=80)
imshow(tree_mask);

We see that along with the fruits, large portions of the brightly lit sections are also retained. Referring to the previous Hue channel image, we can see that these sections have the same kind of red hue as the fruits.

To get around this, let us check the Value channel of the image.

tree_hsv = rgb2hsv(tree[:,:,:-1])
plt.figure(num=None, figsize=(8, 6), dpi=80)
plt.imshow(tree_hsv[:,:,2], cmap='gray')
plt.colorbar();

We can see that those brightly lit areas have incredibly high values. Let us take this into account when we create the mask.
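The reason a Value condition helps is that hue ignores brightness entirely. A minimal sketch with two made-up pixels (a dark red and a bright red, both assumptions for illustration) shows the idea:

```python
import numpy as np
from skimage.color import rgb2hsv

# Two hypothetical pixels with the same hue but different brightness.
pixels = np.array([[[0.5, 0.0, 0.1],    # dark red, like a shaded fruit
                    [1.0, 0.0, 0.2]]])  # bright red, like a sunlit highlight
hsv = rgb2hsv(pixels)
hue, value = hsv[..., 0], hsv[..., 2]

hue_mask = (hue > 0.80) & (hue <= 1.00)   # both pixels pass
value_mask = value < 0.90                 # only the darker pixel passes
combined = hue_mask & value_mask          # same result as multiplying the masks
```

Both pixels share a hue of about 0.97, so only the Value test tells them apart. The code below adds the same kind of Value condition to the tree mask.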

lower_mask = tree_hsv [:,:,0] > 0.80
upper_mask = tree_hsv [:,:,0] <= 1.00
value_mask = tree_hsv [:,:,2] < .90
mask = upper_mask*lower_mask*value_mask
red = tree[:,:,0] * mask
green = tree[:,:,1] * mask
blue = tree[:,:,2] * mask
tree_mask = np.dstack((red, green, blue))
plt.figure(num=None, figsize=(8, 6), dpi=80)
imshow(tree_mask);

Great! We are almost there. We now have to find a way to clean the image and remove the little white dots. For this, we can simply use the median_filter function from SciPy’s ndimage module, which we imported earlier.
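To see why a median filter removes specks but keeps larger regions, here is a small sketch on a toy mask. The array and the 3x3 window are assumptions chosen to keep the example readable.

```python
import numpy as np
from scipy.ndimage import median_filter

# Toy binary mask: one 5x5 blob plus a single-pixel speck.
toy_mask = np.zeros((10, 10), dtype=np.uint8)
toy_mask[4:9, 4:9] = 1   # the blob we want to keep
toy_mask[1, 1] = 1       # isolated noise pixel

# Each output pixel becomes the median of its 3x3 neighbourhood.
cleaned = median_filter(toy_mask, size=3)
```

The speck disappears because most of its neighbours are zero, while the blob survives with only its four corners trimmed. The code below applies the same filter to the fruit mask with a larger window of 10.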

lower_mask = tree_hsv [:,:,0] > 0.80
upper_mask = tree_hsv [:,:,0] <= 1.00
value_mask = tree_hsv [:,:,2] < .90
mask = median_filter(upper_mask*lower_mask*value_mask, 10)
red = tree[:,:,0] * mask
green = tree[:,:,1] * mask
blue = tree[:,:,2] * mask
tree_mask = np.dstack((red, green, blue))
plt.figure(num=None, figsize=(8, 6), dpi=80)
imshow(tree_mask);

We can see that incorporating the Median Filter gets us an extremely clean image. Now we need to identify each blob. To do this, we use the label function in Skimage.
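One detail worth knowing is how `label` decides that two pixels belong to the same blob. The toy array below is an assumption for illustration; it contains two pixels that touch only at a corner.

```python
import numpy as np
from skimage.measure import label

# Two foreground pixels that touch only diagonally.
toy = np.array([[1, 0, 0],
                [0, 1, 0],
                [0, 0, 0]], dtype=bool)

# Default: diagonal neighbours count as connected.
_, n_default = label(toy, return_num=True)
# connectivity=1: only pixels sharing an edge count as connected.
_, n_strict = label(toy, connectivity=1, return_num=True)
```

With the default setting the two pixels form one blob; with `connectivity=1` they are counted as two. The call below uses the default on the cleaned mask.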

tree_blobs = label(rgb2gray(tree_mask) > 0)
imshow(tree_blobs, cmap = 'tab10');

We can see that the function identifies the different blobs in the image. The next step now is to get the properties of each blob. To do this we must make use of the regionprops_table function in Skimage.

properties =['area','bbox','convex_area','bbox_area',
             'major_axis_length', 'minor_axis_length',
             'eccentricity']
df = pd.DataFrame(regionprops_table(tree_blobs, properties = properties))
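To get a feel for what the table contains, here is a sketch on a tiny hand-made label image. The array and the names `toy_labels` and `toy_df` are hypothetical.

```python
import numpy as np
import pandas as pd
from skimage.measure import regionprops_table

# Toy label image: label 1 is a 2x3 rectangle, label 2 is a single pixel.
toy_labels = np.zeros((6, 6), dtype=int)
toy_labels[1:3, 1:4] = 1
toy_labels[4, 4] = 2

table = regionprops_table(toy_labels, properties=['label', 'area', 'bbox'])
toy_df = pd.DataFrame(table)   # one row per blob
```

Note that `bbox` is split into four columns, `bbox-0` to `bbox-3`, holding the minimum row, minimum column, maximum row, and maximum column. The maximum values are exclusive, so the rectangle above gets the box (1, 1, 3, 4).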

The regionprops_table function returns the properties of each blob as a dictionary of columns, which converts directly into a pandas DataFrame. This allows us to easily manipulate the data and pinpoint specific blobs. As an example of how useful this DataFrame is, let us use the bbox feature to draw bounding boxes on the image.
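One thing to watch is that the bbox values are in row and column order, whereas a Matplotlib Rectangle expects an (x, y) corner followed by a width and a height. The helper below is a hypothetical sketch of that conversion, with made-up numbers.

```python
def bbox_to_rectangle(bbox):
    """Convert (min_row, min_col, max_row, max_col) to ((x, y), width, height)."""
    min_row, min_col, max_row, max_col = bbox
    # x runs along columns and y along rows, so the order is swapped.
    return (min_col, min_row), max_col - min_col, max_row - min_row

anchor, width, height = bbox_to_rectangle((10, 20, 40, 80))
```

The code below performs the same conversion inline for every blob.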

blob_coordinates = [(row['bbox-0'],row['bbox-1'],
                     row['bbox-2'],row['bbox-3'] )for 
                    index, row in df.iterrows()]

fig, ax = plt.subplots(1,1, figsize=(8, 6), dpi = 80)
for blob in tqdm(blob_coordinates):
    width = blob[3] - blob[1]
    height = blob[2] - blob[0]
    patch = Rectangle((blob[1],blob[0]), width, height, 
                       edgecolor='r', facecolor='none')
    ax.add_patch(patch)
ax.imshow(tree);
ax.set_axis_off()

If we look carefully we can see that there is a single bounding box on the upper left of the image. The object within the bounding box is clearly not a fruit. But how do we get rid of it?

Well, what we can do is filter the pandas DataFrame. For simplicity we shall filter it via the eccentricity column, because the outlier blob has a distinctive shape.
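There are two simple ways to write such a filter, sketched here on a DataFrame with hypothetical eccentricity values (the real ones come from regionprops_table). The 0.90 cut-off is an arbitrary number for illustration.

```python
import pandas as pd

# Hypothetical eccentricity values; blob 3 plays the role of the outlier.
toy_df = pd.DataFrame({'label': [1, 2, 3, 4],
                       'eccentricity': [0.62, 0.70, 0.98, 0.66]})

# Variant 1: drop whichever row has the largest eccentricity.
drop_max = toy_df[toy_df['eccentricity'] < toy_df['eccentricity'].max()]
# Variant 2: keep rows below an explicit, hand-picked cut-off.
below_cutoff = toy_df[toy_df['eccentricity'] < 0.90]
```

Both variants give the same result here. The difference is that the first always removes at least one row, even when there is no outlier, while the second needs a cut-off chosen by inspecting the data. The line below uses the first variant.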

df = df[df['eccentricity'] < df['eccentricity'].max()]

If we plot out the bounding boxes again we see that we were able to successfully filter out the blob.

Lastly, let us cut out the bounding boxes from the image and display them as their own images.

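# Rebuild the coordinates from the filtered DataFrame so the outlier is excluded.
blob_coordinates = [(row['bbox-0'], row['bbox-1'],
                     row['bbox-2'], row['bbox-3'])
                    for index, row in df.iterrows()]
fig, ax = plt.subplots(1, len(blob_coordinates), figsize=(15,5))
for n, axis in enumerate(ax.flatten()):
    axis.imshow(tree[int(blob_coordinates[n][0]):
                     int(blob_coordinates[n][2]), 
                     int(blob_coordinates[n][1]):
                     int(blob_coordinates[n][3])])

fig.tight_layout()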

Excellent, we have successfully identified the interesting mouth-like fruits in the image. The images can now be saved into a file and used later on (perhaps for a machine learning project).
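As a rough idea of how the saving step could look, here is a minimal sketch. The random stand-in image, the bounding boxes, the file names, and the temporary output folder are all assumptions made for illustration.

```python
import os
import tempfile
import numpy as np
from skimage.io import imsave

# Stand-in image and bounding boxes (both hypothetical).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(60, 80, 3), dtype=np.uint8)
boxes = [(5, 10, 25, 40), (30, 50, 55, 75)]   # (min_row, min_col, max_row, max_col)

out_dir = tempfile.mkdtemp()   # throwaway folder; use a real path in practice
saved_paths = []
for n, (r0, c0, r1, c1) in enumerate(boxes):
    crop = image[r0:r1, c0:c1]                    # slice rows first, then columns
    path = os.path.join(out_dir, f'blob_{n}.png')
    imsave(path, crop, check_contrast=False)      # write one PNG per blob
    saved_paths.append(path)
```

With the real data, `image` would be the tree array and `boxes` the integer bounding boxes taken from the filtered DataFrame.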

In Conclusion

Knowing how to do blob detection is a valuable skill for any data scientist working with images. It can be used to separate different sections of an image into different points of interest. You can actually use this technique to create the data that will be fed to your machine learning algorithm. Though this was a relatively simple and straightforward lesson, I hope you now have an idea of how to use blob detection to solve basic image problems.

Written by Tonichi Edeza