
Silver-Lining Clouds with AI

The Nebula project: a cloud segmentation application built on convolutional neural networks

Thoughts and Theory

Authors

Yann Bernery, Ludovic Changeon, Cathy Baynaud-Samson, José Castro

Photo by James Wheeler from Pexels

Fish, Flower, Gravel, Sugar… This is by no means a random inventory, but a list of cloud formation names. Named after their shapes, these formations may be of great interest for understanding Earth’s climate.

This interest stems from two contradictory characteristics of these shallow clouds: a significant impact on Earth’s radiation balance, and a resistance to modeling. Back in 2019, the Max Planck Institute for Meteorology challenged the data science community with a Kaggle competition. The objective: identify and locate the four cloud classes mentioned above on satellite images previously labelled by a panel of experts. This project does not claim to solve climate change directly, though we believe it can contribute to a better understanding of the climate.

Photos by NASA on Earth Observing System Data and Information System (EOSDIS), not copyrighted material

This article presents one solution to this image segmentation problem. See the references for the underlying assumptions. The project and its results were part of a Data Scientist training course at the DataScientest institute.

More information about the dataset is given at the end of this article.

Feel free to explore and play with the demo to see the model in action: Nebula project demo.

Analysis limitations

From the disagreements between specialists on the location and nature of many cloud formations, we inferred that class characterisation might be fuzzy. Exploratory analysis proved us right. Neither cloud surface areas nor the proportion of white pixels in an image allowed us to identify the distinguishing characteristics of each class. An analysis of the images’ grayscale distributions finally convinced us to set pixel luminosity aside. To locate the classes, we would need to look for patterns within the images, and therefore use smarter computing techniques: convolutional neural networks (CNNs).

Unity is strength

Memory constraints drove us to use networks with as few parameters as possible. Through iteration, we selected two of them: a ResNet-18 backbone fitted inside a U-Net architecture, and an EfficientNet-B1 backbone fitted inside a LinkNet architecture. These two models have 14 and 8.5 million parameters, respectively. A core idea of the project was to take advantage of two different models in order to build on their respective strengths. Accordingly, the loss functions and optimizers were chosen to be as different as the architectures themselves.
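
For illustration, the snippet below sketches how such a pair of models might be assembled with the segmentation_models library. The backbone names match our setup, but the activation and pretrained weights shown are assumptions for the sketch, not our exact training configuration.

```python
# A minimal sketch of the two-model setup with the segmentation_models
# library; hyperparameters here are illustrative assumptions.
import segmentation_models as sm

N_CLASSES = 4  # Fish, Flower, Gravel, Sugar

# Model 1: ResNet-18 backbone inside a U-Net architecture (~14M parameters)
model_unet = sm.Unet(
    "resnet18",
    classes=N_CLASSES,
    activation="sigmoid",        # independent probability per class
    encoder_weights="imagenet",  # assumed pretrained initialisation
)

# Model 2: EfficientNet-B1 backbone inside a LinkNet architecture (~8.5M parameters)
model_linknet = sm.Linknet(
    "efficientnetb1",
    classes=N_CLASSES,
    activation="sigmoid",
    encoder_weights="imagenet",
)
```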

The table below summarizes the main characteristics of the resulting models.

Diagrams by the authors

Before usage, the images were all resized from 1400×2100 pixels down to 320×480, then augmented by means of vertical and horizontal symmetries, rotations below 10°, and gamma corrections. This last augmentation was specifically chosen to compensate for the solar streaks apparent in numerous images. The first model benefited from a pre-training phase on noise-filtered images over a limited number of epochs. This helped the model focus on lower-level cloud characteristics during the main training.
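
As a sketch, such a pipeline could be expressed with the Albumentations library; the probabilities and gamma range below are illustrative placeholders, not the project’s exact settings.

```python
# Illustrative augmentation pipeline (placeholder parameter values).
import albumentations as A

transform = A.Compose([
    A.Resize(320, 480),                            # 1400x2100 -> 320x480
    A.HorizontalFlip(p=0.5),                       # horizontal symmetry
    A.VerticalFlip(p=0.5),                         # vertical symmetry
    A.Rotate(limit=10, p=0.5),                     # rotation below 10 degrees
    A.RandomGamma(gamma_limit=(80, 120), p=0.5),   # gamma correction
])

# Usage: augmented = transform(image=image, mask=mask)
```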

Such a fine wedding

The evaluation of both models resulted in close mean performances, each model being slightly stronger on certain classes. As a consequence, the final predictions resulted from a consensus: a simple arithmetic mean of each pixel’s predicted probability of belonging to a class, taken across the two models. Nevertheless, it appeared that both the fuzzy cloud patterns and our hardware limitations would require correction. This was performed by means of two post-processing techniques: pixel activation and surface filtering.
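
A minimal sketch of this consensus step, assuming two Keras-style models named model_unet and model_linknet as in the earlier snippet:

```python
# Average the per-pixel class probabilities predicted by the two models.
def ensemble_predict(batch):
    """batch: (n, 320, 480, 3) images -> (n, 320, 480, 4) mean probabilities."""
    p1 = model_unet.predict(batch)
    p2 = model_linknet.predict(batch)
    return (p1 + p2) / 2.0
```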

Predictions pipeline diagram by the authors, Photos by NASA on Earth Observing System Data and Information System (EOSDIS), not copyrighted material

Light my pixel

At pixel level, a prediction is a probability that the pixel belongs to the class under consideration. The default threshold is 0.5, meaning a pixel is assigned to the class whenever its probability is above 50%. However, cloud formations have different structures: the Sugar class appears sparser than Fish. This justifies a lower activation threshold for Sugar, which widens the activated area.

The optimal thresholds were identified by sampling 30 batches of 64 images from the training set.
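
The activation step itself is simple; the sketch below uses placeholder thresholds (only Sugar is lowered, as an assumption for illustration), not the values we actually calibrated.

```python
# Turn probability maps into binary masks with per-class thresholds.
import numpy as np

CLASSES = ["Fish", "Flower", "Gravel", "Sugar"]
THRESHOLDS = {"Fish": 0.50, "Flower": 0.50, "Gravel": 0.50, "Sugar": 0.40}  # placeholders

def activate(probs):
    """probs: (h, w, 4) probabilities -> (h, w, 4) binary masks."""
    masks = np.zeros_like(probs, dtype=np.uint8)
    for i, name in enumerate(CLASSES):
        masks[..., i] = (probs[..., i] > THRESHOLDS[name]).astype(np.uint8)
    return masks
```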

Photos by NASA on Earth Observing System Data and Information System (EOSDIS), not copyrighted material

Cleaning the surfaces

The masks resulting from the previous operations may be of any size, from a single pixel to the entire image. In order to separate digital artifacts from actual clouds, we devised a calibration step. We deliberately chose a simple, arbitrary rule so as to avoid overfitting the data: based on the training set, the 15th-percentile surface size was identified for each class, and all predicted surfaces below that threshold were deleted.
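
A minimal sketch of this surface filter, using SciPy’s connected-component labelling; the min_area argument would come from the per-class 15th-percentile calibration described above.

```python
# Remove connected regions of a binary mask smaller than min_area pixels.
import numpy as np
from scipy import ndimage

def filter_small_surfaces(mask, min_area):
    labeled, n_regions = ndimage.label(mask)
    for region_id in range(1, n_regions + 1):
        region = labeled == region_id
        if region.sum() < min_area:
            mask[region] = 0
    return mask
```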

A few predictions

Considering that an image is worth a thousand words, we gathered a few predictions below. The first row shows Sugar (yellow) and Gravel (green) surfaces filtered out for being too small. The second row shows a slight reshaping of the Sugar class, resulting from its lowered activation threshold.

Photos by NASA on Earth Observing System Data and Information System (EOSDIS), not copyrighted material

Evaluations and results

Evaluations were performed on a reserved set of 224 images, unseen by both models. The evaluation is summarized in the table below.

Note: As mentioned in the results table, the selected metric is the Dice coefficient, equivalent to the F1-score.
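
For reference, the Dice coefficient between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|), which coincides with the F1-score for binary masks. A minimal implementation might look like this:

```python
import numpy as np

def dice_coefficient(pred, truth, eps=1e-7):
    """Dice between two binary masks; returns 1.0 when both are empty."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    if not pred.any() and not truth.any():
        return 1.0  # convention: an empty prediction of an empty mask scores 1
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)
```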

Except for the Fish class, we can see that both the model ensemble and the post-processing improve on the raw models’ individual performances. Even if the improvements seem substantial, let’s bear in mind that the tests were performed on a limited set of 224 images.

As a bonus, a late submission to the Kaggle competition confirmed the performance of our approach, awarding us a virtual 74th position (out of 1,500 contributors). The public score was 0.66173, and the private score 0.66016.

Our strategy showed accurate results on both the training and test data sets. Model inference on entirely different images was the last test we performed: we scanned other locations from all over the world. The model was able to identify the clouds it had been trained on, and the results seemed consistent. See a few examples below.

Photos by NASA on Earth Observing System Data and Information System (EOSDIS), not copyrighted material

In conclusion, we put forward that the proposed solution performs well given the hardware constraints. Further improvements remain possible within the present framework, as numerous avenues are still open: for example, a thorough analysis of the interpolation methods used when resizing images, or gradient accumulation to enable denser model architectures. Finally, we already obtained strong evidence that our model has the potential to enter the top 10 of the above-mentioned Kaggle competition.

Conclusion

Gathering new data from satellite imagery, data important enough to help us understand our planet; analysing entities as fuzzy as clouds with human-like pattern recognition capability; and applying this to a virtually unlimited number of images…

All this could be achieved thanks to deep learning techniques available to everybody. Combined with Big Data, everything converges towards a major expansion of the satellite imagery domain, which could dramatically improve our understanding of the environment.

Acknowledgements

First and foremost, we wish to thank the whole DataScientest team for their invaluable help and support during the past months.

The data scientist degree course followed at the DataScientest.com institute enabled us to acquire strong and diverse knowledge, a prerequisite for working autonomously and critically on data science projects such as this one.

Our team was built around four people who did not know each other. Due to geographical distance and lockdown, we had very little chance to meet, and in fact we still have not. Everyone’s commitment was the key to success during our eight-month project.

Another key to successful data science projects is data management. Taking advantage of Google Colab / Drive as the main platform and GitHub as the versioning tool for notebooks and libraries, we were able to support the whole project from the initial learning steps to the more sophisticated final ones.

We acknowledge the use of imagery from NASA Worldview, part of the NASA Earth Observing System Data and Information System (EOSDIS).

And last but not least, thank you for reading. We hope you enjoyed the experience as much as we enjoyed carrying out this work. Feel free to contact us with any request for additional information.

Dataset

The Nebula project, which is discussed in this article, was carried out for educational purposes only.

The dataset used for this project is available from the Kaggle competition homepage.

The data were chosen from three regions, each spanning 21 degrees of longitude and 14 degrees of latitude. More details are available in the study by Rasp, Schulz, Bony and Stevens, mentioned in the references section of this article [1].

As mentioned in the Kaggle data description, the images that compose the dataset were downloaded from NASA Worldview. The data are therefore subject to the following terms and conditions: NASA Earth science data.

References

[1] S. Rasp, H. Schulz, S. Bony and B. Stevens, Combining crowd-sourcing and deep learning to explore the meso-scale organization of shallow convection (2019)
