Exploring Data Augmentation with Keras and TensorFlow
A guide to using data augmentation in your next deep learning project!
Data augmentation is a strategy for increasing the amount and variety of training data by applying transformations such as cropping, padding, and flipping to the existing samples.
Data augmentation makes the model more robust to slight variations in its input, and hence helps reduce overfitting.
It is neither practical nor efficient to store the augmented data in memory, and that is where the ImageDataGenerator class from Keras (also included in TensorFlow’s high-level API, tensorflow.keras) comes into play. ImageDataGenerator generates batches of tensor image data with real-time data augmentation. And the best part? It’s just one line of code!
The images produced by the generator have the same dimensions as the input images.
Below is an auxiliary script that we will use to show visually what can be achieved with the ImageDataGenerator class.
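The original helper script is not reproduced here. A minimal stand-in, written under our own assumptions, might look like the sketch below. The function names sample_augmentations and show_augmentations and the synthetic gradient image are hypothetical, plotting assumes matplotlib is installed, and depending on the installed version, applying transformations may also require SciPy.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def sample_augmentations(datagen, image, n_samples=6, seed=0):
    """Draw n_samples augmented copies of one (height, width, channels) image."""
    batch = np.expand_dims(image, axis=0)  # flow() expects a rank-4 batch
    iterator = datagen.flow(batch, batch_size=1, seed=seed)
    return np.stack([next(iterator)[0] for _ in range(n_samples)])


def show_augmentations(samples):
    """Plot the samples side by side (hypothetical helper, needs matplotlib)."""
    import matplotlib.pyplot as plt

    _, axes = plt.subplots(1, len(samples), figsize=(3 * len(samples), 3))
    for ax, sample in zip(np.atleast_1d(axes), samples):
        ax.imshow(np.clip(sample, 0, 255).astype("uint8"))
        ax.axis("off")
    plt.show()


# Hypothetical stand-in for a real photo: a 64x96 RGB horizontal gradient.
image = np.tile(np.linspace(0, 255, 96)[None, :, None], (64, 1, 3))
samples = sample_augmentations(ImageDataGenerator(horizontal_flip=True), image)
```

Calling show_augmentations(samples) would then display the six augmented copies in a row. Any other ImageDataGenerator configuration can be passed to sample_augmentations in the same way.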
1. Rotation
By specifying rotation_range, each generated image is randomly rotated by an angle (in degrees) drawn from the range -rotation_range to +rotation_range.
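To see the sampling range in action without plotting anything, we can ask the generator which parameters it would pick. This is a small sketch that assumes the get_random_transform method of ImageDataGenerator, which returns a dictionary of sampled parameters with the rotation angle stored under "theta". The image shape (64, 64, 3) and the seed are arbitrary choices.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=30)

# get_random_transform() reports the parameters picked for one augmentation;
# "theta" is the rotation angle in degrees.
np.random.seed(0)  # only to make the sketch repeatable
angles = [datagen.get_random_transform((64, 64, 3))["theta"] for _ in range(1000)]

print(f"smallest angle: {min(angles):.1f}, largest angle: {max(angles):.1f}")
```

Every sampled angle should fall between -30 and +30 degrees, with both signs represented.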
2. Width Shifting
The width_shift_range is a floating-point number between 0.0 and 1.0 that specifies the upper bound of the fraction of the total width by which the image is randomly shifted, either to the left or to the right. (A value of 1 or more is interpreted as a number of pixels instead.)
3. Height Shifting
height_shift_range works exactly like width shifting, except that the image is shifted vertically instead of horizontally.
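The "upper bound of a fraction" wording for both shifts can be checked numerically. The sketch below again relies on get_random_transform and on a naming quirk that, as far as we can tell, holds in the Keras implementation: the key "tx" carries the vertical offset and "ty" the horizontal one, both in pixels. The 100x200 image size is an arbitrary assumption.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

height, width = 100, 200
datagen = ImageDataGenerator(width_shift_range=0.2, height_shift_range=0.1)

np.random.seed(0)  # only to make the sketch repeatable
params = [datagen.get_random_transform((height, width, 3)) for _ in range(1000)]

# Assumed key meaning: "tx" = vertical offset, "ty" = horizontal offset (pixels).
max_vertical = max(abs(p["tx"]) for p in params)
max_horizontal = max(abs(p["ty"]) for p in params)

print(f"largest vertical shift:   {max_vertical:.1f} px (bound {0.1 * height:.0f})")
print(f"largest horizontal shift: {max_horizontal:.1f} px (bound {0.2 * width:.0f})")
```

With these settings the horizontal shift never exceeds 20% of the width (40 pixels) and the vertical shift never exceeds 10% of the height (10 pixels).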
4. Brightness
The brightness_range takes a pair of values (lower, upper) that specify the range from which a brightness factor is randomly picked. A factor of 0.0 produces a completely black image, 1.0 leaves the image unchanged, and values above 1.0 make it brighter.
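A short sketch of how the brightness value is sampled, assuming get_random_transform exposes it under the "brightness" key. As far as we can tell, the sampled value is then used as a multiplicative factor by Pillow's brightness enhancer, where 1.0 returns the original image. The range (0.5, 1.5) is an arbitrary example.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Factors below 1.0 darken the image, factors above 1.0 brighten it.
datagen = ImageDataGenerator(brightness_range=(0.5, 1.5))

np.random.seed(0)  # only to make the sketch repeatable
factors = [
    datagen.get_random_transform((64, 64, 3))["brightness"] for _ in range(1000)
]
darker = sum(f < 1.0 for f in factors)
brighter = sum(f > 1.0 for f in factors)

print(f"{darker} draws would darken the image, {brighter} would brighten it")
```

Because the range is centred on 1.0, roughly half of the draws darken the image and half brighten it.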
5. Shear Intensity
Shear transformation slants the shape of the image. It differs from rotation in that one axis stays fixed while the image is stretched at a certain angle, known as the shear angle. This creates a sort of ‘stretch’ in the image, which is not seen in rotation. shear_range specifies the maximum shear angle in degrees; the actual angle is drawn at random from -shear_range to +shear_range.
6. Zoom
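A random zoom is obtained with the zoom_range argument, which is either a single float z (giving zoom factors in the range 1 - z to 1 + z) or an explicit [lower, upper] pair. A zoom factor less than 1.0 magnifies the image, while a factor greater than 1.0 zooms out of the image.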
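The way a single float is expanded into an interval can be inspected directly. This sketch assumes that the generator stores the expanded interval in its zoom_range attribute and that get_random_transform reports the sampled factors under "zx" and "zy"; the value 0.3 is an arbitrary example.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# A single float z is expanded to the interval [1 - z, 1 + z].
datagen = ImageDataGenerator(zoom_range=0.3)
lower, upper = datagen.zoom_range

np.random.seed(0)  # only to make the sketch repeatable
params = datagen.get_random_transform((64, 64, 3))
# One factor is drawn per image axis, so the zoom need not keep the aspect ratio.
zx, zy = params["zx"], params["zy"]

print(f"interval: [{lower:.2f}, {upper:.2f}], sampled factors: {zx:.2f}, {zy:.2f}")
```

Passing a pair such as zoom_range=[0.5, 1.0] instead would restrict the generator to zooming in only.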
7. Channel Shift
Channel shift shifts the channel values by a random amount chosen from the range specified by channel_shift_range.
8. Horizontal Flip
Setting horizontal_flip=True makes the generator flip images horizontally at random.
9. Vertical Flip
Instead of flipping horizontally, we can also apply a random vertical flip by setting vertical_flip=True.
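How often does a flip actually happen? The sketch below counts the flip decisions over many draws. It assumes that get_random_transform reports them under "flip_horizontal" and "flip_vertical" and, as far as we can tell from the implementation, that each flip is decided independently with probability 0.5.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True)

np.random.seed(0)  # only to make the sketch repeatable
params = [datagen.get_random_transform((64, 64, 3)) for _ in range(1000)]
horizontal_share = np.mean([bool(p["flip_horizontal"]) for p in params])
vertical_share = np.mean([bool(p["flip_vertical"]) for p in params])

print(f"horizontally flipped: {horizontal_share:.0%}")
print(f"vertically flipped:   {vertical_share:.0%}")
```

Both shares should come out close to one half, so about a quarter of the images are flipped along both axes.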
But what about the points where we don’t have any value?
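When an image is shifted, rotated, or sheared, some points in the output fall outside the boundaries of the input and have no value. The fill_mode argument offers several options for how these regions are filled.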
1. Nearest
This is the default option where the closest pixel value is chosen and repeated for all the empty values. (E.g. aaaaaaaa|abcd|dddddddd)
2. Reflect
This mode creates a ‘reflection’ and fills the empty values in a reverse order of the known values. (E.g. abcddcba|abcd|dcbaabcd)
3. Wrap
Instead of a reflect effect, we can also create a ‘wrap’ effect by copying the values of the known points into the unknown points, keeping the order unchanged. (E.g. abcdabcd|abcd|abcdabcd)
4. Constant
If we want to fill all the points lying outside the boundaries of the input with a constant value, this mode helps us achieve exactly that. The constant value is specified by the cval argument. (E.g. kkkkkkkk|abcd|kkkkkkkk for cval=k)
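The letter patterns for the four fill modes can be reproduced on a one-dimensional toy example. The sketch below uses NumPy's np.pad purely as a stand-in; it is not what Keras calls internally, and the mapping of names is our own assumption (Keras "nearest" corresponds to NumPy "edge", and Keras "reflect" to NumPy "symmetric"). The helper fill_pattern is hypothetical.

```python
import numpy as np

# Keras fill_mode name -> NumPy pad mode used here as a stand-in.
NUMPY_MODE = {
    "nearest": "edge",
    "reflect": "symmetric",
    "wrap": "wrap",
    "constant": "constant",
}


def fill_pattern(fill_mode, known="abcd", pad=4):
    """Return 'left|known|right' showing how `pad` empty cells are filled."""
    indices = np.arange(len(known))
    # For the constant mode, -1 marks a filled cell and is rendered as 'k'.
    extra = {"constant_values": -1} if fill_mode == "constant" else {}
    padded = np.pad(indices, pad, mode=NUMPY_MODE[fill_mode], **extra)
    letters = "".join("k" if i < 0 else known[i] for i in padded)
    return f"{letters[:pad]}|{letters[pad:-pad]}|{letters[-pad:]}"


for mode in NUMPY_MODE:
    print(f"{mode:>8}: {fill_pattern(mode)}")
```

With four empty cells on each side, the patterns are aaaa|abcd|dddd for nearest, dcba|abcd|dcba for reflect, abcd|abcd|abcd for wrap, and kkkk|abcd|kkkk for constant, matching the inner part of the examples above.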
But there’s more!
There are some additional benefits that you can harness straight out of this class. Some examples include zero-centering the data (featurewise_center, samplewise_center) and normalization (featurewise_std_normalization, samplewise_std_normalization). These options are enabled by passing boolean values to the ImageDataGenerator constructor. The featurewise options rely on statistics of the whole dataset, so the generator’s fit method must be called on the training data before they take effect. We can also rescale the values by specifying the rescale argument, a factor by which every value is multiplied.
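The difference between the samplewise and featurewise options can be sketched on random data. The example assumes the standardize method of ImageDataGenerator, which applies the configured normalization to a single image and works in place (hence the copies), and the fit method, which computes dataset statistics. The random "images" are a hypothetical stand-in for real data.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 8 RGB images of 32x32 pixels, values in [0, 255].
images = rng.integers(0, 256, size=(8, 32, 32, 3)).astype("float32")

# Samplewise options use only the statistics of the image being processed.
samplewise_gen = ImageDataGenerator(
    rescale=1.0 / 255,
    samplewise_center=True,
    samplewise_std_normalization=True,
)
sample_out = samplewise_gen.standardize(images[0].copy())

# Featurewise options need fit() first to compute statistics over the dataset.
featurewise_gen = ImageDataGenerator(featurewise_center=True)
featurewise_gen.fit(images)
feature_out = np.stack(
    [featurewise_gen.standardize(img.copy()) for img in images]
)

print("single image mean/std:", sample_out.mean(), sample_out.std())
print("per-channel dataset mean:", feature_out.mean(axis=(0, 1, 2)))
```

The single image ends up with a mean near 0 and a standard deviation near 1, while the featurewise generator brings the per-channel mean of the whole dataset close to 0.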
There is also a preprocessing_function argument, which lets you specify your own custom function for processing each image. How cool is that!
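As a small illustration, here is a sketch with a made-up custom function. The function invert_colors is hypothetical; the only assumptions about the library are that preprocessing_function receives one rank-3 image and returns an array of the same shape, and that standardize applies it.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def invert_colors(image):
    """Hypothetical custom step: invert an image whose values lie in [0, 255]."""
    return 255.0 - image


datagen = ImageDataGenerator(preprocessing_function=invert_colors)

image = np.full((4, 4, 3), 55.0, dtype="float32")
# standardize() is where the generator applies the custom function.
result = datagen.standardize(image.copy())

print(result[0, 0])
```

Every pixel value of 55 becomes 200 after inversion. When the generator is used through flow, the same function runs on each image after the random augmentations have been applied.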