
Lately, while working on my research project, I began to understand the importance of image augmentation techniques. The aim of the project is to train a robust generative model able to reconstruct the original images.
The problem addressed is anomaly detection, which is quite challenging since there is a small volume of data and, consequently, the model cannot do all the work alone. The common scenario is to train a two-network model on the normal images available for training and evaluate its performance on a test set that contains both normal and anomalous images.
The initial hypothesis is that the generative model should capture the normal distribution well but, at the same time, fail to reconstruct the abnormal samples. How can we verify this hypothesis? We can look at the reconstruction error, which should be high for abnormal images and low for normal samples.
In this post, I am going to list the best data augmentation techniques to increase the size and diversity of images in the dataset. The main goal is to improve the performance and generalization of the model. We are going to explore simple transformations, like rotation, cropping and Gaussian blur, and more sophisticated techniques, such as Gaussian noise and random blocks.
Image Augmentation techniques:
1. Simple transformations
- Resize
- Gray Scale
- Normalize
- Random Rotation
- Center Crop
- Random Crop
- Gaussian Blur
2. More advanced techniques
- Gaussian Noise
- Random Blocks
- Central Region
1. Introduction to Surface Crack dataset

In this tutorial, we are going to use the Surface Crack Detection Dataset, which you can download on Kaggle. As the name suggests, it provides images of surfaces with and without cracks, so it can be used as a dataset for anomaly detection, where the anomalous class is represented by images with cracks and the normal class by surfaces without them. It contains 4000 color images of surfaces with and without defects, and both classes are available in both the training and test sets. Each image is acquired at a resolution of 227 by 227 pixels.
2. Simple transformations
This section covers the different transformations available in the torchvision.transforms module. Before going deeper, we import the required modules and load a defect-free image from the training set.
Let’s display the dimension of the image:
np.asarray(orig_img).shape #(227, 227, 3)
It means that we have a 227×227 image with 3 channels.
Resize
Neural networks usually expect a fixed input size, so we often need to resize images before passing them to the model. For example, we can resize the 227×227 image into 32×32 and 128×128 images.
resized_imgs = [T.Resize(size=size)(orig_img) for size in [32,128]]
plot(resized_imgs,col_title=["32x32","128x128"])

It’s worth noting that we lose resolution when we obtain a 32×32 image, while the 128×128 version seems to maintain the high resolution of the sample.
Gray Scale
RGB images can be challenging to manage. So, it can be useful to convert an image to grayscale:
gray_img = T.Grayscale()(orig_img)
plot([gray_img], cmap='gray', col_title=["Gray"])

Normalize
Normalization can be an effective way to speed up computation in neural-network-based models and help them learn faster. There are two steps to normalize an image:
- subtract the channel mean from each input channel
- then divide it by the channel standard deviation.
We can display the original image together with its normalized version:

Random Rotation
The T.RandomRotation method rotates the image by a random angle chosen within a specified range.

Center Crop
We crop the central portion of the image using the T.CenterCrop method, where the crop size needs to be specified.

This transformation can be useful when the image has a large background around the borders that isn’t necessary at all for the classification task.
Random Crop
Instead of cropping the central part of the image, we randomly crop a portion of the image through the T.RandomCrop method, which takes the output size of the crop as a parameter.

Gaussian Blur
We apply a Gaussian blur transform to the image using a Gaussian kernel. Blurring makes the image less sharp and distinct; feeding blurred samples to a neural network can make it more robust to small variations and noise when learning the patterns of the samples.

3. More advanced techniques
The previous examples used simple transformations provided by torchvision. Now we’ll focus on more sophisticated techniques implemented from scratch.
Gaussian Noise
Gaussian noise is a popular way to add noise to the whole dataset, forcing the model to learn the most important information contained in the data. It consists of injecting a matrix of random values drawn from a Gaussian distribution, after which we clip the pixel values between 0 and 1. The higher the noise factor, the noisier the image.
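A from-scratch sketch of this idea (the helper name and noise factor are illustrative, not from the original post):

```python
import numpy as np
from PIL import Image

def add_gaussian_noise(img, noise_factor=0.3):
    """Inject Gaussian noise into a PIL image and clip the result to [0, 1]."""
    arr = np.asarray(img, dtype=np.float32) / 255.0        # scale to [0, 1]
    noisy = arr + noise_factor * np.random.randn(*arr.shape)
    noisy = np.clip(noisy, 0.0, 1.0)                       # keep valid range
    return Image.fromarray((noisy * 255).astype(np.uint8))

# Stand-in image; replace with a real sample.
orig_img = Image.fromarray(
    np.random.randint(0, 256, size=(227, 227, 3), dtype=np.uint8)
)
noisy_img = add_gaussian_noise(orig_img, noise_factor=0.3)
```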

Random Blocks
Square patches are applied as masks at random positions in the image. The higher the number of patches, the harder the problem becomes for the neural network to solve.
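One possible from-scratch implementation (the helper name, patch count and block size are illustrative assumptions):

```python
import numpy as np
from PIL import Image

def add_random_blocks(img, n_blocks=6, block_size=32):
    """Mask n_blocks randomly placed square patches with black pixels."""
    arr = np.asarray(img).copy()
    h, w = arr.shape[:2]
    for _ in range(n_blocks):
        y = np.random.randint(0, h - block_size + 1)
        x = np.random.randint(0, w - block_size + 1)
        arr[y:y + block_size, x:x + block_size] = 0
    return Image.fromarray(arr)

# Stand-in image; replace with a real sample.
orig_img = Image.fromarray(
    np.random.randint(0, 256, size=(227, 227, 3), dtype=np.uint8)
)
masked_img = add_random_blocks(orig_img, n_blocks=6, block_size=32)
```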

Central Region
It’s a very simple technique to help the model generalize better. It consists of adding a patch block in the central region of the image.
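A minimal sketch (helper name and patch size are illustrative assumptions):

```python
import numpy as np
from PIL import Image

def add_central_region(img, size=64):
    """Mask a size x size square patch in the centre of the image."""
    arr = np.asarray(img).copy()
    h, w = arr.shape[:2]
    y0, x0 = (h - size) // 2, (w - size) // 2
    arr[y0:y0 + size, x0:x0 + size] = 0
    return Image.fromarray(arr)

# Stand-in image; replace with a real sample.
orig_img = Image.fromarray(
    np.random.randint(0, 256, size=(227, 227, 3), dtype=np.uint8)
)
patched_img = add_central_region(orig_img, size=64)
```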

Final thoughts:
I hope you found this tutorial useful. The intention was to give an overview of image augmentation approaches for improving the generalization of neural-network-based models. Feel free to comment if you know other effective techniques. In the next post, I am going to explain how to exploit these techniques with autoencoders. The code is on Kaggle. Thanks for reading. Have a nice day!
Other related articles:
Albumentations: A Python library for advanced Image Augmentation strategies
How to quickly build your own dataset of images for Deep Learning
Disclaimer: This data set is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) by Çağlar Fırat Özgenel.