
How to Implement Augmentations for Multispectral Satellite Image Segmentation Using Fastai v2 and Albumentations

Improve the performance of your deep learning algorithms with multispectral image augmentations and Fastai v2.

Figure 1: Augmentations applied to a Landsat 8 patch and its corresponding cloud mask. Image by Author.

Update

For information about the course Introduction to Python for Scientists (available on YouTube) and other articles like this, please visit my website cordmaur.carrd.co.

Introduction

We know that image augmentation is a key factor in computer vision tasks. It helps the algorithm avoid overfitting and limits the need for huge training datasets [1]. Most deep learning frameworks have a vision module that implements augmentation "out of the box", as is the case with the Keras, PyTorch and Fastai libraries. The problem arises when we need to feed the model images that don't match the 3-channel standard (RGB). That's the case for most remote sensing applications (e.g., Figure 1) and a number of other areas.

To overcome this limitation, I will show how to implement a multispectral augmentation using the Albumentations library [2] and plug it into a Fastai v2 DataBlock for further training. The principles of creating a DataBlock for satellite images can be found in my previous story, "How to create a DataBlock for Multispectral Satellite Image Segmentation with the Fastai-v2".

1- The dataset

The data we will be using is a public dataset available on Kaggle, called "95-Cloud: Cloud Segmentation on Satellite Images", which contains training patches extracted from 57 Landsat 8 scenes. This dataset is, in fact, an extension of a previous Kaggle dataset that has been used to train a very simple Fastai v1 model (here). The patches are 384×384 pixels and contain 4 bands: Red, Green, Blue and Near Infrared. Additionally, there is a corresponding ground-truth patch that marks the clouds. As our objective is just to show how to implement the augmentation, without further considerations about accuracy, we will stick with the latest version only.

Figure 2: Kaggle dataset. Image by Kaggle (https://www.kaggle.com/sorour/95cloud-cloud-segmentation-on-satellite-images)

To make life easier, all the code shown here is ready to be used in a Kaggle notebook (here), so we will start by installing the necessary dependencies:

# update torch and torchvision
!pip install -q torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
# install kornia; we will give it a try to accelerate our preprocessing
!pip install -q --upgrade kornia
# pin allennlp, as required by the notebook's environment
!pip install -q allennlp==1.1.0.rc4
# and install fastai v2
!pip install -q --upgrade fastai

2- Opening the Images

Besides the fact that the vision libraries don't support multichannel images, this dataset keeps each band in a separate folder. So, in order to open an image, we first need to derive the paths for each band and then collate them into a single 4-channel image. Instead of subclassing the TensorImage class, as we did in the previous story, I will try to make things easier here and open the image as a plain tensor. The drawback of this approach is that we will not be able to use Fastai's internal visualization functions, like DataLoader.show_batch(), as they don't know how to display the 4 bands.

The first step will be to create three base functions:

  1. Open a TIF file and return it as a PyTorch’s Tensor;
  2. Given a filename (suppose it is the Red band), return the names of the other three bands (Green, Blue and Nir);
  3. Open the 4 bands at once and collate them into a single image. To do this, we will stack them along the first dimension (the channel axis), as in the sketch below.
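
A minimal sketch of the three helpers is shown below. The band-name replacement in get_filenames assumes the 95-Cloud naming convention, where "red" appears in both the folder and the file name; adapt it if your copy of the dataset is organized differently.

import numpy as np
import torch
from pathlib import Path
from PIL import Image
from fastai.vision.all import TensorImage

def open_tif(fn, cls=torch.Tensor):
    # 1- open a single-band TIF file and wrap it in the given tensor class
    return cls(torch.from_numpy(np.array(Image.open(fn)).astype(np.float32)))

def get_filenames(red_fn):
    # 2- given the red band path, derive the paths of the 4 bands
    return [Path(str(red_fn).replace('red', band))
            for band in ('red', 'green', 'blue', 'nir')]

def open_ms_tif(fns):
    # 3- open the 4 bands and stack them along the first (channel) axis
    return TensorImage(torch.stack([open_tif(fn) for fn in fns], dim=0))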

Once our functions are defined, we will test them by passing one item into a Pipeline. A Pipeline is a sequence of functions that are applied to an item to transform it the way we want.

To load the items, we will consider that our base images are in the red folder, and we will then get the other bands automatically. So, our pipeline will be composed of two functions: 1- get_filenames and 2- open_ms_tif. Our final image will have shape (4, 384, 384). To display it with matplotlib, we will do a final permutation of the dimensions to put the channels in the last axis, giving (384, 384, 4), and also slice out the Nir band using [..., :3].
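
In code, the test might look like the sketch below. The dataset path is a placeholder, and items stands in for the notebook's list of red-band patch files.

import matplotlib.pyplot as plt
from fastai.vision.all import Pipeline, get_files

path = Path('path/to/95-cloud')  # placeholder: adjust to your dataset location
items = get_files(path/'train_red', extensions='.TIF')

# the pipeline maps the red filename to all 4 bands, then opens them
pipe = Pipeline([get_filenames, open_ms_tif])
img = pipe(items[0])
print(img.shape)  # torch.Size([4, 384, 384])

# channels last for matplotlib, dropping the Nir band with [..., :3]
plt.imshow(img.permute(1, 2, 0)[..., :3] / img.max())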

Considering that our final objective is to segment the clouds in the image, we have to apply the same augmentations to the ground truth. So, a similar procedure will be used to open the mask.
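
A possible implementation, again assuming the 95-Cloud naming (a ground-truth folder with "gt" in place of "red"); the get_gt_filename helper is hypothetical:

from functools import partial
from fastai.vision.all import TensorMask

def get_gt_filename(red_fn):
    # hypothetical mapping from a red-band path to its ground-truth path
    return Path(str(red_fn).replace('red', 'gt'))

# partial pre-fills cls=TensorMask, so the same opener returns a mask
mask_pipe = Pipeline([get_gt_filename, partial(open_tif, cls=TensorMask)])

img, msk = pipe(items[0]), mask_pipe(items[0])
print(img.shape, msk.shape)

_, axs = plt.subplots(1, 2, figsize=(10, 5))
axs[0].imshow(img.permute(1, 2, 0)[..., :3] / img.max())
msk.show(ctx=axs[1])  # any matplotlib axes can serve as the ctx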

torch.Size([4, 384, 384]) torch.Size([384, 384])
Figure 3: Code output displaying patch and corresponding cloud mask. Image by Author.

As we can see, our pipeline worked just fine. I used the .show() method of TensorMask with the ctx (context) argument, just to show that in Fastai you can direct the output to any context. Another interesting function is partial, which returns a reference to a function with a given set of parameters pre-filled.

3- Creating a Dataset and a Dataloader

Before going into the augmentations, we will first create a dataset and a dataloader, just to check that everything is working as desired. Note that we don't need to specify the get_items function in the DataBlock, because our source is already a list of items. We will also define a function show_img() to display the multichannel tensor.
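
A sketch of the DataBlock and the display helper; bs=4 matches the batch shapes printed below, and the division by the maximum inside show_img is just a display convenience:

from fastai.vision.all import DataBlock, TransformBlock, RandomSplitter

def show_img(img, ctx=None):
    # display the RGB channels of a (4, H, W) tensor on the given axes
    ctx = ctx if ctx is not None else plt.gca()
    ctx.imshow(img.permute(1, 2, 0)[..., :3] / img.max())

db = DataBlock(blocks=(TransformBlock([get_filenames, open_ms_tif]),
                       TransformBlock([get_gt_filename,
                                       partial(open_tif, cls=TensorMask)])),
               splitter=RandomSplitter(valid_pct=0.2))

# the source is already a list of items, so no get_items is needed
dls = db.dataloaders(source=items, bs=4)
xb, yb = dls.one_batch()
print(xb.shape, yb.shape)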

torch.Size([4, 4, 384, 384]) torch.Size([4, 384, 384])
Figure 4: Code output showing a batch sample. (Top) RGB images | (Bottom) Cloud masks. Image by Author.

4- Augmentations

For the augmentations we will be using the Albumentations [2] library. It offers a huge list of possible augmentations, grouped into categories such as pixel-level and spatial-level transformations. For this tutorial we will keep it simple and use just the basic shift, flip, scale, rotate, brightness and contrast. The full list can be found in the online documentation (here).
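
With Albumentations, that set of transforms can be composed as below. The probability and limit values are illustrative choices, not the article's exact settings, and depending on the value range of your bands you may want to rescale them before the brightness/contrast transform:

import albumentations as A

transforms = A.Compose([
    A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, p=0.5),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
])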

One important aspect of the Albumentations library is that it supports augmentation for segmentation and object detection. That means it can apply to the target (mask or bounding box) the same augmentation that was applied to the image. That's a crucial point, as we need to keep our cloud masks matching the augmented images.

Fastai will apply the augmentation to a tuple (X, Y), where X is the image and Y is the mask. To make it work within the framework, it is necessary to subclass the ItemTransform class and create the encodes() method. To make it generic, our subclass will receive the desired transformation at the time of instance creation, like so:
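
A sketch of such a subclass, following the pattern from the Fastai documentation but keeping everything as tensors (Albumentations expects channels-last numpy arrays, hence the permutes):

from fastai.vision.all import ItemTransform

class SegmentationAlbumentationsTransform(ItemTransform):
    split_idx = 0  # run on the training set only (see the note below)
    def __init__(self, aug):
        self.aug = aug  # the desired transformation, received at instance creation
    def encodes(self, x):
        img, msk = x
        # apply the same random transform to the image and its mask
        res = self.aug(image=img.permute(1, 2, 0).numpy(), mask=msk.numpy())
        return (TensorImage(torch.from_numpy(res['image']).permute(2, 0, 1)),
                TensorMask(torch.from_numpy(res['mask'])))

aug = SegmentationAlbumentationsTransform(transforms)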

Figure 5: Code output showing: (Left) the original item (image and mask) | (Right) the augmented versions of image and mask. Image by Author.

Note that there is a split_idx=0 defined within the class. That tells Fastai to augment just the training dataset, and not the validation dataset. Now that we have set up our transformation class, let's use it in the DataBlock. We will recreate the DataBlock, now with the item_tfms parameter set to aug. We will then ask the dataloader to create one item multiple times to see how it goes.
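
One way to do that (a sketch using the helpers defined earlier) is to feed the dataloader a source made of the same patch repeated, so a single training batch contains several independently augmented copies of it:

db = DataBlock(blocks=(TransformBlock([get_filenames, open_ms_tif]),
                       TransformBlock([get_gt_filename,
                                       partial(open_tif, cls=TensorMask)])),
               splitter=RandomSplitter(valid_pct=0.2),
               item_tfms=aug)  # our Albumentations wrapper

# the same patch 5 times: 4 items land in the training set, 1 in validation
dls = db.dataloaders(source=[items[0]] * 5, bs=4)
xb, yb = dls.train.one_batch()

_, axs = plt.subplots(2, 4, figsize=(16, 8))
for i in range(4):
    show_img(xb[i].cpu(), ctx=axs[0, i])   # each copy was augmented independently
    yb[i].cpu().show(ctx=axs[1, i])        # and so was its mask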

Figure 6: Code output demonstrating the augmentations being applied to the same patch at the moment they pass through the dataloader. Image by Author.

Conclusion

As we saw in this story, implementing data augmentation for multispectral satellite images is just a matter of finding the right tools. In this regard, Albumentations is a great companion, as it can deal with many channels and augment the targets as well.

As mentioned before, the notebook with all the code can be found on Kaggle (here). There, you will also find a comparison of the learning accuracy with and without the augmentations.

Hope you enjoyed.

References

[1] Shorten, C., Khoshgoftaar, T.M., 2019. A survey on Image Data Augmentation for Deep Learning. J Big Data 6, 60. https://doi.org/10.1186/s40537-019-0197-0.

[2] Buslaev, A., Iglovikov, V.I., Khvedchenya, E., Parinov, A., Druzhinin, M., Kalinin, A.A., 2020. Albumentations: Fast and Flexible Image Augmentations. Information 11, 125. https://doi.org/10.3390/info11020125

