Complementing Machine Learning Algorithms with Image Processing

A short intro to commonly used Image Processing Algorithms in Python

Rafael Madrigal
Towards Data Science


Photo by Sid Verma on Unsplash

Pics, or it didn't happen. Taking photos of everyday moments has become today's default. Last year, Keypoint Intelligence projected that humanity would generate 1,436,300,000,000 images. Mylio, an image organization solutions provider, even forecast this number to hit 1.6 trillion in 2022. Wow. That's a lot of photos!

As a data scientist, learning to process and extract information from these images is crucial, especially for computer vision tasks like object detection and image segmentation, and for applications such as self-driving cars. Several image processing libraries in Python, such as scikit-image, OpenCV, and Pillow/PIL, allow us to do precisely that.

How does Image Processing fit in the overall machine learning pipeline?

Sample Machine Learning Workflow with Image Processing (For Illustration Purposes Only). Photo by Author

We usually read and clean digital images using our preferred image processing library and extract useful features that can be used by machine learning algorithms.

In the sample pipeline above, we carved out each leaf from the source image and applied image enhancements (i.e., white balancing, thresholding/morphology, and histogram equalization). After that, we measured each leaf's geometric features, such as convex area, perimeter, major and minor axis lengths, and eccentricity, which together form our features table. Finally, we used these features as inputs to a classification algorithm and produced an acceptable F-1 score. The last two steps mirror what we would do with the popular iris dataset, except this time we generated the measurements ourselves. Isn't that nice?
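The tail end of that pipeline can be sketched as follows. This is an illustrative sketch, not the actual pipeline code: the feature columns mirror the measurements listed above but are filled with random stand-in values, the three-class labels are hypothetical, and the choice of a random forest classifier is an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Stand-in features table: in the real pipeline, each row would hold the
# measured convex area, perimeter, axis lengths, and eccentricity of one leaf.
n_leaves = 300
X = rng.normal(size=(n_leaves, 5))
y = rng.integers(0, 3, size=n_leaves)  # hypothetical species labels (3 classes)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print(f1_score(y_test, clf.predict(X_test), average='macro'))
```

With real geometric measurements instead of random noise, the same two lines of fitting and scoring are all it takes to close the loop from pixels to predictions.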

Digital Images as NumPy arrays in Python

We can represent a digital image as a 3-D function F(x, y, z), where x and y refer to spatial coordinates and F refers to the intensity of the image (usually from 0 to 255) at that specific point. In Python (and in linear algebra), we can represent this function as a NumPy array with three dimensions (a tensor). The z-axis can be interpreted as the image channels (e.g., RGB for Red, Green, and Blue).
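A minimal sketch of this representation, using a tiny synthetic array rather than a real photo:

```python
import numpy as np

# A tiny 4x6 "image" with 3 channels (RGB), intensities in [0, 255]
im = np.zeros((4, 6, 3), dtype=np.uint8)
im[:, :, 0] = 255          # set the red channel to its maximum: a pure red image

red_channel = im[:, :, 0]  # slicing along the z-axis selects one channel

print(im.shape)            # (4, 6, 3): rows (y), columns (x), channels (z)
print(im[0, 0])            # the RGB triplet at one pixel
```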

There are several color models available for images; Mathworks provides a good discussion of the different color spaces here. Our image processing libraries allow us to convert from one color space to another with a single code line.

from skimage.color import rgb2hsv, rgb2gray
from skimage.data import coffee

im = coffee()       # shape: (400, 600, 3); uint8 values from 0 to 255
hsv = rgb2hsv(im)   # shape: (400, 600, 3); float values from 0 to 1
gry = rgb2gray(im)  # shape: (400, 600); float values from 0 to 1
im_bin = gry > 0.5  # shape: (400, 600); composed of True and False only
Coffee rendered in individual color channels. The first row shows the original image in RGB and its grayscale and binarized counterparts. The second and third rows show the image rendered in RGB and HSV spaces, respectively. Image from scikit-image processed by the author.

Common Image Processing Algorithms

  • Image Enhancements. In neural network applications, we flip, rotate, upsample, downsample, and shear images to augment the existing dataset. Sometimes, we want to correct the contrast and white balance of an image to accentuate the desired object or region. In the example below, we enhance a dark image by matching its cumulative histogram to a linear target distribution, allowing us to see details that were not visible in the original image.
import numpy as np
from skimage.exposure import cumulative_distribution

def apply_histogram_correction(im, target_dist, target_bins=range(256)):
    """Apply a histogram correction to an image `im` using a target
    distribution `target_dist`. The correction is applied at every pixel
    intensity bin in `target_bins`."""
    freq, bins = cumulative_distribution(im)
    corrected = np.interp(freq, target_dist, target_bins)

    return corrected[im].astype(np.uint8)
Histogram equalization using a linear distribution, performed on every color channel. Image generated by the author.
  • Morphological Filtering. In morphological filtering, we use a structuring element or kernel that inhibits or enhances regions of interest in a pixel neighborhood. Some common morphological operations include erosion, dilation, opening, closing, skeletonizing, and removing small holes.
from skimage.morphology import (erosion, dilation, opening, closing,
                                skeletonize, remove_small_holes)
from skimage.morphology import disk
from skimage.data import shepp_logan_phantom
from skimage.util import img_as_ubyte

im = img_as_ubyte(shepp_logan_phantom())  # Original image
im_open = im.copy()
im_open[10:30, 200:210] = 0               # Punch a hole to demonstrate closing

selem = disk(6)  # Structuring element

# Morphological transformations
eroded = erosion(im, selem)
dilated = dilation(im, selem)
opened = opening(im, selem)
closed = closing(im_open, selem)
skeleton = skeletonize(im == 0)                           # Operates on a binary image
no_holes = remove_small_holes(im > 0, area_threshold=64)  # Expects a boolean image
Different morphological operations available in scikit-image. skeletonize and remove_small_holes do not require a structuring element; the latter requires an area threshold instead.
  • Blob Detection. In image processing, blobs are defined as bright-on-dark or dark-on-bright regions in an image. Detected blobs usually signal an object or parts of an object, which helps in object recognition and/or object tracking. The three most common algorithms for blob detection are the Laplacian of Gaussian, the Difference of Gaussian, and the Determinant of Hessian, all of which are based on derivatives of the intensity function with respect to position.
from skimage.data import hubble_deep_field
from skimage.feature import blob_dog, blob_log, blob_doh
from skimage.color import rgb2gray
import math

im = hubble_deep_field()[:500, :500]
im_gry = rgb2gray(im)

# Laplacian of Gaussian
blobs_log = blob_log(im_gry, max_sigma=30, num_sigma=10, threshold=.1)
blobs_log[:, 2] = blobs_log[:, 2] * math.sqrt(2)  # Convert sigma to radius

# Difference of Gaussian
blobs_dog = blob_dog(im_gry, max_sigma=30, threshold=.1)
blobs_dog[:, 2] = blobs_dog[:, 2] * math.sqrt(2)  # Convert sigma to radius

# Determinant of Hessian
blobs_doh = blob_doh(im_gry, max_sigma=30, threshold=.01)
Results of different blob detection algorithms available in scikit-image. Taken from scikit-image documentation examples.
  • Feature Extraction. Feature extraction takes advantage of connected components in the image. In the leaf classification example above, we isolated each leaf and computed its area, perimeter, and eccentricity, among other properties. We then used these measurements as inputs to our machine learning algorithm. Features can be quickly extracted using the ‘regionprops’ module in scikit-image.
from skimage.measure import label, regionprops
from skimage.color import rgb2gray, rgba2rgb
from skimage.io import imread

im = rgb2gray(rgba2rgb(imread('candies.png')))
im_bw = im < 0.5
im_clean = morphological_ops(im_bw)  # Placeholder for the cleaning steps above
im_label = label(im_clean)

props = regionprops(im_label)
props[0].area
### Output: 4453, the area (in pixels) of the region with label == 1
Process of image labelling for feature extraction: convert to grayscale > binarize > label image. Each blob is a set of connected pixels or elements in the array. Image generated by the author.
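To turn those per-blob measurements into a features table ready for a machine learning model, scikit-image also offers `regionprops_table`, which returns the chosen properties as columns. A sketch on a synthetic binary image (two rectangles standing in for segmented leaves; the property names are the same ones used in the leaf example):

```python
import numpy as np
import pandas as pd
from skimage.measure import label, regionprops_table

# Synthetic binary image with two blobs standing in for segmented objects
im_bw = np.zeros((50, 50), dtype=bool)
im_bw[5:15, 5:20] = True    # blob 1: 10 x 15 pixels
im_bw[30:45, 25:45] = True  # blob 2: 15 x 20 pixels

im_label = label(im_bw)
props = regionprops_table(
    im_label,
    properties=('label', 'area', 'perimeter', 'eccentricity',
                'major_axis_length', 'minor_axis_length'),
)
features = pd.DataFrame(props)  # one row per blob -> input to a classifier
print(features)
```

Each row of `features` describes one connected component, so the DataFrame can be fed directly to a classifier as in the leaf pipeline above.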
  • Image Segmentation. In image segmentation, we want to isolate parts of an image through thresholding operations. An example is binarizing a grayscale image using a > or < operator to generate a boolean mask. In some cases, we can use other color spaces, like HSV or RGB, to generate a threshold that isolates an object of interest. In the example below, we generate a mask to isolate all the grapes in a fruit basket.
from scipy.ndimage import median_filter
from skimage.color import rgb2hsv
from skimage.io import imread

fruits = imread('fruits.jpg')
fruits_hsv = rgb2hsv(fruits)

hmask = fruits_hsv[:, :, 0] > 0.7   # Hue filter (lower bound)
hmask1 = fruits_hsv[:, :, 0] < 0.9  # Hue filter (upper bound)
vmask = fruits_hsv[:, :, 2] > 0.3   # Value (brightness) filter
mask = median_filter(hmask * hmask1 * vmask, size=25)
Image Segmentation. Isolating all the grapes using thresholds in the HSV color space. Image generated by Author.

Wrapping It Up

This article discussed how we can incorporate image processing into the machine learning pipeline and showcased the different image processing algorithms commonly used in the field. In the next set of articles, we will dive deeper into each of the algorithms shared here and discuss the concepts behind them.
