Research has demonstrated the effectiveness of deep convolutional neural networks on many computer vision tasks across domains such as autonomous vehicles, medical imaging, and more. However, these networks rely heavily on large amounts of high-quality data to perform well on real-world problems. But how much data is enough to train a good model? One thing is clear: the more high-quality data we have, the better. In this article, we will increase the amount of training data with a technique called image augmentation, which helps our model avoid overfitting the training images.
Overfitting: what is it, and when does it happen?
During training, both the training and validation errors decrease as long as the model is learning the underlying task. At some point, the training error keeps decreasing while the error on the validation/hold-out data starts to increase. This scenario is called overfitting: the model keeps learning patterns from the training data that do not generalize to unseen data (see the illustration).

Problems related to limited image data
Access to only a very limited amount of high-quality training data is one of the root causes of overfitting.
- The more high-quality data we have, the better, but collecting such data can be very expensive and time-consuming.
- Robust computer vision models should be invariant to minor changes (e.g. rotation, translation) in the images. If a model can recognize the elephant on the left, we would also like it to recognize the ones on the right as elephants. Unfortunately, this is not a given unless the model is trained to do so.

Image Augmentation Methods
Image augmentation is the process of increasing the size of the training set using images that already exist in it. As we can see from the image below, there are two main families of image augmentation techniques:
- Basic image manipulation: this will be the focus of this article.
- Deep learning approaches such as Generative Adversarial Network (GAN) based augmentation, style transfer, and adversarial training: this could be the topic of another article.

Basic image augmentation techniques and illustrations
The following five main techniques of image transformation can be used to increase the size of the data. Each technique is followed by an example for better understanding:
- Geometric transformations
- Color space transformations
- Random erasing
- Kernel filters
- Mixing images
1. Geometric transformations
Before using this technique, it is important to understand which transformations make sense for your data in terms of preserving the label (e.g. for digit recognition, rotating or flipping a 6 can turn it into a 9).
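As a concrete illustration, here is a minimal sketch of a geometric augmentation pipeline using the albumentations library (linked at the end of this article). The file name and parameter values are my own illustrative choices, not prescriptions.

```python
import albumentations as A
import cv2

# Illustrative geometric augmentations; parameter values are arbitrary.
geometric = A.Compose([
    A.HorizontalFlip(p=0.5),  # safe for elephants, label-destroying for digits
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=15, p=0.5),
])

image = cv2.imread("elephant.jpg")           # hypothetical input image (BGR array)
augmented = geometric(image=image)["image"]  # a new, transformed copy of the image
```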

2. Color space transformations
This technique offers a lot of room for creativity, since there are many possible transformations, but adjusting brightness and contrast is probably the most common one.
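A minimal sketch of brightness/contrast and hue/saturation jitter with albumentations could look like this; again, the values are only illustrative.

```python
import albumentations as A
import cv2

# Illustrative color space augmentations; parameter values are arbitrary.
color = A.Compose([
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.7),
    A.HueSaturationValue(hue_shift_limit=10, sat_shift_limit=20, val_shift_limit=10, p=0.3),
])

image = cv2.imread("elephant.jpg")       # hypothetical input image
augmented = color(image=image)["image"]
```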

3. Random erasing
Like geometric transformations, this technique might not preserve the label. It is interesting, however, because it forces the model to pay attention to the whole image instead of a single region: the model cannot recognize the elephant by looking only at its face, because that part might be blacked out, which forces it to use the whole context. The technique is inspired by dropout, a regularization method that zeroes out a random fraction of the units in a neural network layer. It also helps with occlusion or unclear parts of test images (e.g. part of an elephant hidden behind a tree or something else).
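Here is a minimal NumPy sketch of random erasing that blacks out one rectangle per call; the default patch sizes are arbitrary, and albumentations offers a ready-made CoarseDropout transform for the same idea.

```python
import numpy as np

def random_erase(image, max_h=32, max_w=32, fill=0):
    """Return a copy of an (H, W, C) image with one random rectangle blacked out.
    Assumes the image is larger than the maximum patch size."""
    h, w = image.shape[:2]
    eh = np.random.randint(1, max_h + 1)   # height of the erased patch
    ew = np.random.randint(1, max_w + 1)   # width of the erased patch
    y = np.random.randint(0, h - eh)       # top-left corner of the patch
    x = np.random.randint(0, w - ew)
    out = image.copy()
    out[y:y + eh, x:x + ew] = fill         # fill the patch with a constant value
    return out
```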

4. Kernel filters
These techniques are mostly used to sharpen or blur images. Under the hood they convolve the image with an N×N kernel, similar to the inner workings of Convolutional Neural Networks (CNNs). They are also very useful for making the model more robust to blurry action shots, especially when all the images in the training data are uniformly sharp.
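To make the N×N kernel explicit, here is a minimal OpenCV sketch of a blur and a sharpen filter; the kernel sizes and values are illustrative.

```python
import cv2
import numpy as np

image = cv2.imread("elephant.jpg")  # hypothetical input image

# Blur: convolve with a 5x5 Gaussian kernel.
blurred = cv2.GaussianBlur(image, ksize=(5, 5), sigmaX=0)

# Sharpen: convolve with a 3x3 kernel that emphasizes the center pixel.
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=np.float32)
sharpened = cv2.filter2D(image, ddepth=-1, kernel=sharpen_kernel)
```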

5. Mixing images
Pixel averaging (mixup) and overlaying crops (CutMix) are two ways of mixing images. Let's consider combining the elephant with the forest image on the right.
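A minimal sketch of pixel averaging in the mixup style is shown below, assuming the two inputs are same-sized NumPy arrays (e.g. the elephant and the forest image); when training a classifier, the same mixing coefficient should also be applied to the two labels.

```python
import numpy as np

def mixup(image_a, image_b, alpha=0.2):
    """Pixel-wise average of two same-sized images with a Beta-sampled weight."""
    lam = np.random.beta(alpha, alpha)      # mixing coefficient in [0, 1]
    mixed = lam * image_a.astype(np.float32) + (1 - lam) * image_b.astype(np.float32)
    return mixed.astype(image_a.dtype), lam # return lam so the labels can be mixed too
```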

Conclusion
With the previous techniques, we were able to generate multiple images from a single one. Applying them to all the images in the training set can considerably increase its size. However, it is very important to understand the data in order to apply only the techniques that preserve the labels.
I hope you enjoyed your journey through this article. If you have any questions or remarks, I will be glad to discuss them further. For further reading, do not hesitate to consult the following links:
https://link.springer.com/article/10.1186/s40537-019-0197-0
https://github.com/albumentations-team/albumentations
Bye for now 🏃🏾