Image Augmentation Examples in Python

Connor Shorten
Towards Data Science
4 min read · Sep 24, 2018


I am currently working on a study reviewing the depth and effectiveness of image data augmentations. The goal of this research is to learn how to expand a dataset so that robust Convolutional Network models can be trained even with limited amounts of data.

This study requires listing all the image augmentations we can think of and enumerating combinations of them to try to improve the performance of an image classification model. Some of the simplest augmentations that come to mind are flipping, translation, rotation, scaling, isolating individual R, G, and B color channels, and adding noise. More exciting augmentations are centered around using the Generative Adversarial Network model, sometimes swapping the generator network for a genetic algorithm. Some creative methods have been proposed as well, such as applying Instagram-style lighting filters to the images, applying random regional sharpening filters, and adding mean images based on clustering techniques. This article will show you how to make augmentations on images using NumPy.
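Isolating a single color channel, for example, is just an array slice in NumPy. A minimal sketch (the sneaker image from the article is not bundled here, so a random RGB array stands in for it):

```python
import numpy as np

# Stand-in image with the usual (height, width, channels) layout
img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Keep only the red channel by zeroing out green and blue
red_only = img.copy()
red_only[:, :, 1:] = 0
```

The same slice with index 0 or 2 zeroed instead isolates the green or blue channel.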

Below is a list and illustration of some of these augmentation techniques. Please leave a comment if you can think of any other ways to augment images that may improve the quality of an image classifier.

Original Image, (Pre-Augmentation)

AUGMENTATIONS

All augmentations are done using NumPy, without the OpenCV library

# Image Loading Code used for these examples
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
img = Image.open('./NIKE.png')
img = np.array(img)
plt.imshow(img)
plt.show()

Flipping

Flipping images is one of the most popular methods of image data augmentation. This is primarily due to the simplicity of the flipping code and because, for most problems, it is intuitive that flipped images would add value to the model. The flipped image below could be thought of as a left shoe rather than a right shoe, so with this data augmentation the model becomes more robust to such variations in the shoes it sees.

# Flipping images with Numpy
flipped_img = np.fliplr(img)
plt.imshow(flipped_img)
plt.show()
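NumPy also offers np.flipud for a vertical flip. Whether that helps depends on the problem: an upside-down shoe is rarely a realistic input, but for domains like aerial imagery it often is. A quick sketch on a tiny stand-in array:

```python
import numpy as np

# Stand-in 2x2 RGB image (the article's shoe image is not bundled here)
img = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

flipped_lr = np.fliplr(img)  # mirror left-right (reverses columns)
flipped_ud = np.flipud(img)  # mirror top-bottom (reverses rows)
```

Note that neither function copies the data; both return views, so call .copy() if you plan to modify the result independently of the original.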

Translations

It is easy to imagine the value of translational augmentation for classifiers whose purpose is detection. Imagine this classification model is trying to detect when the shoe is in the image vs. when it is not. These translations will help it pick up on the shoe without seeing the entire shoe in the frame.

# Image dimensions: rows (HEIGHT) x columns (WIDTH).
# Each snippet below modifies img in place, so reload the
# original image before running the next one.
HEIGHT, WIDTH = img.shape[:2]
SHIFT = 20

# Shifting left: each pixel takes the value SHIFT columns to its
# right; the vacated right edge is filled with black
for r in range(HEIGHT):
    for c in range(WIDTH):
        if c < WIDTH - SHIFT:
            img[r][c] = img[r][c + SHIFT]
        else:
            img[r][c] = 0
plt.imshow(img)
plt.show()

# Shifting right: iterate columns right-to-left so a pixel is not
# overwritten before it has been copied
for r in range(HEIGHT):
    for c in range(WIDTH - 1, -1, -1):
        if c >= SHIFT:
            img[r][c] = img[r][c - SHIFT]
        else:
            img[r][c] = 0
plt.imshow(img)
plt.show()

# Shifting up: each pixel takes the value SHIFT rows below it
for r in range(HEIGHT):
    for c in range(WIDTH):
        if r < HEIGHT - SHIFT:
            img[r][c] = img[r + SHIFT][c]
        else:
            img[r][c] = 0
plt.imshow(img)
plt.show()

# Shifting down: iterate rows bottom-to-top for the same reason
for r in range(HEIGHT - 1, -1, -1):
    for c in range(WIDTH):
        if r >= SHIFT:
            img[r][c] = img[r - SHIFT][c]
        else:
            img[r][c] = 0
plt.imshow(img)
plt.show()
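The pixel-by-pixel loops make the logic explicit, but the same shifts can be done in a single vectorized call with np.roll, followed by a zero-fill of the edge that wrapped around. A sketch on a random stand-in array (the shoe image is not bundled here):

```python
import numpy as np

img = np.random.randint(0, 256, size=(100, 100, 3), dtype=np.uint8)
SHIFT = 20

# Shift 20 pixels to the right: roll along the column axis, then
# blank out the columns that wrapped around from the far edge
shifted_right = np.roll(img, SHIFT, axis=1)
shifted_right[:, :SHIFT] = 0

# Shift 20 pixels down: roll along the row axis and blank the top
shifted_down = np.roll(img, SHIFT, axis=0)
shifted_down[:SHIFT, :] = 0
```

A negative shift value rolls in the opposite direction, giving the left and up translations.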

Noise

Noise is an interesting augmentation technique that I am starting to become more familiar with. I have seen a lot of interesting papers on adversarial training where you can add a carefully crafted layer of noise to an image and, as a result, the model will no longer classify it correctly. I am still looking at ways to generate better noise than the illustration below. Adding noise may help with lighting distortions and make the model more robust in general.

# ADDING NOISE
HEIGHT, WIDTH, DEPTH = img.shape
noise = np.random.randint(5, size=(HEIGHT, WIDTH, DEPTH), dtype='uint8')

for i in range(HEIGHT):
    for j in range(WIDTH):
        for k in range(DEPTH):
            # leave pure-white background pixels untouched and cap the
            # sum so uint8 arithmetic cannot wrap around past 255
            if img[i][j][k] != 255:
                img[i][j][k] = min(int(img[i][j][k]) + int(noise[i][j][k]), 254)
plt.imshow(img)
plt.show()
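The triple loop works, but NumPy can add the noise in one vectorized step. Doing the arithmetic in a wider integer type and clipping afterwards avoids the uint8 wrap-around that an in-place += can cause. A sketch on a random stand-in array:

```python
import numpy as np

img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Random noise in [0, 4], same shape as the image
noise = np.random.randint(5, size=img.shape, dtype=np.uint8)

# Add in int16 so values near 255 do not wrap around to small
# numbers, then clip back into the valid [0, 255] uint8 range
noisy = np.clip(img.astype(np.int16) + noise, 0, 255).astype(np.uint8)
```

For zero-centered noise, draw from a signed range (e.g. np.random.randint(-4, 5, ...)) and clip the same way.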

GANs:

There has been a lot of interest in my research on using Generative Adversarial Networks for data augmentation. Below are some of the images I have been able to produce using the MNIST dataset.

As we can tell from the images above, they certainly do look like 3’s, 7’s, and 9’s. I am currently having some trouble extending the architecture of the network to support the 300x300x3 size output of the sneakers compared to the 28x28x1 MNIST digits. However, I am very excited about this research and am looking forward to continuing it!

Thanks for reading this article! Hopefully you now know how to implement basic data augmentations to improve your classification models.
