[ CVPR 2014 / Paper Summary ] The Secrets of Salient Object Segmentation

Jae Duk Seo
Towards Data Science
6 min read · Apr 28, 2018

I already knew what segmentation was, but I had no idea what salient object segmentation was. So I decided to change that by reading one of the papers published in CVPR 2014.

Please note that this post is for my future self to look back and remember what this paper was about.

Salient Object Detection / Fixation Prediction

Image from this website

Salient → most noticeable or important. (from a Google search)

So from the image above we can conclude that salient object detection is the term used for segmenting the most important objects in an image.

Image from this paper

And from the above paper we can conclude that the term fixation prediction is used when we wish to predict where human eyes fixate most while viewing an image.
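To make the term concrete, a fixation map can be built from recorded gaze points by smoothing each point into a density map. A minimal sketch of the idea (the gaze coordinates, image size, and `sigma` below are invented values, not data from the paper):

```python
import numpy as np

def fixation_map(points, shape, sigma=15.0):
    """Turn recorded (row, col) gaze points into a smooth fixation
    density map by summing a Gaussian bump at each fixation."""
    rows = np.arange(shape[0])[:, None]
    cols = np.arange(shape[1])[None, :]
    fmap = np.zeros(shape)
    for r, c in points:
        fmap += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2 * sigma ** 2))
    return fmap / fmap.max()  # normalize to [0, 1]

# Three fixations on a 100x100 image, two of them close together
fmap = fixation_map([(50, 50), (52, 48), (20, 80)], (100, 100))
```

Normalizing to [0, 1] makes maps recorded from different viewers comparable.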

Abstract

This paper claims that there is a huge bias (called dataset design bias) in the way salient object benchmarks are currently evaluated. (Keep in mind this paper was published in 2014.) The paper also introduces a novel method of salient object segmentation that outperforms existing models.

Introduction

So before we move on there is one term we need to know.

Dataset Design Bias → a specific type of bias that is caused by experimenters’ unnatural selection of dataset images. (For more information please click here)

The authors of this paper tell us two things. Saliency in computer vision is not a well-defined term; it can be understood as either a) segmentation, where we segment the most important object in the image, or b) eye fixation prediction. However, existing methods suffer from two problems.

  1. If a model focuses on one problem, it tends to overlook the connection to the other side.
  2. If a model is benchmarked on one dataset, it tends to overfit to that dataset, creating an inherent bias.

Related Works

Again the paper describes what each problem is: fixation prediction is eye-gaze prediction, while salient object segmentation is the task of segmenting the most important objects in a given image. The paper also discusses the connection between the object recognition task (where we try to find objects of each class) and salient object segmentation. Finally, the authors discuss one dataset bias, center bias, which is a huge problem in visual saliency analysis. It is caused both by participants of eye-tracking experiments tending to look at the center of the image, and by photographers tending to place the subject of interest at the center of the frame.
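Center bias is easy to visualize: a "model" that ignores the image entirely and just outputs a centered Gaussian already matches human fixations surprisingly well. A minimal sketch of such a center-prior baseline (the `sigma_frac` parameter is my own arbitrary choice):

```python
import numpy as np

def center_prior(shape, sigma_frac=0.25):
    """A saliency map that ignores the image and predicts a Gaussian
    blob at the center -- the essence of center bias."""
    h, w = shape
    rows = np.arange(h)[:, None] - (h - 1) / 2.0
    cols = np.arange(w)[None, :] - (w - 1) / 2.0
    sigma = sigma_frac * min(h, w)
    prior = np.exp(-(rows ** 2 + cols ** 2) / (2 * sigma ** 2))
    return prior / prior.max()

prior = center_prior((64, 64))
```

The fact that such an image-blind map is a strong baseline is exactly why center bias has to be controlled for in benchmarks.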

DataSet Analysis

This section is huge, mainly because it is one of the core contributions of this paper. And I have to say, I don't know all of the statistical analyses they did in detail; however, once I research them, I will surely make another blog post.

Psychophysical experiments on the PASCAL dataset
Here the authors performed experiments to gather ground-truth data for fixation prediction on the PASCAL 2010 dataset.

Evaluating dataset consistency
To compare the level of agreement among different labelers (from the previous experiment), the authors did extensive analysis of inter-subject consistency (for both salient object segmentation and fixation prediction). And one interesting fact the authors found is shown in a figure in the paper.
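The authors use their own statistical protocol, but the flavor of an inter-subject consistency analysis can be sketched as a leave-one-out agreement score: compare each labeler's binary mask against the consensus of the remaining labelers. The toy masks and the IoU-based score here are my own illustration, not the paper's exact measure:

```python
import numpy as np

def iou(a, b):
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

def inter_subject_consistency(masks):
    """Mean leave-one-out agreement: score each labeler's binary mask
    against the pixels marked by at least half of the other labelers."""
    scores = []
    for i, m in enumerate(masks):
        others = [masks[j] for j in range(len(masks)) if j != i]
        consensus = np.mean(others, axis=0) >= 0.5
        scores.append(iou(m, consensus))
    return float(np.mean(scores))

# Three invented labelers who roughly agree on a central square
m1 = np.zeros((10, 10), bool); m1[2:8, 2:8] = True
m2 = np.zeros((10, 10), bool); m2[2:8, 2:7] = True
m3 = np.zeros((10, 10), bool); m3[3:8, 2:8] = True
score = inter_subject_consistency([m1, m2, m3])
```

A score near 1 means labelers largely agree on what the salient object is.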

Benchmarking
Here the authors compared many state-of-the-art salient object segmentation algorithms and found that their performance decreased significantly when they were evaluated on datasets other than the FT dataset they were tuned on.
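The standard score on these benchmarks is the F-measure, conventionally computed with β² = 0.3 so that precision is weighted more heavily than recall. A minimal sketch (the toy masks are invented):

```python
import numpy as np

def f_measure(pred_mask, gt_mask, beta2=0.3):
    """F-measure as used on salient object benchmarks; beta^2 = 0.3
    weights precision more heavily than recall."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / pred.sum() if pred.sum() else 0.0
    recall = tp / gt.sum() if gt.sum() else 0.0
    if precision + recall == 0:
        return 0.0
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)

# Toy masks: the prediction covers most of the ground-truth square.
gt = np.zeros((10, 10)); gt[2:8, 2:8] = 1
pred = np.zeros((10, 10)); pred[3:8, 3:8] = 1
score = f_measure(pred, gt)
```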

Dataset design bias
The authors really went all out in this section: they performed many statistical analyses, comparing local color contrast, global color contrast, local gPb boundary strength, and object size. They summarized their findings in one paragraph.

Basically, in the FT dataset there is a strong contrast between the object we want to segment and the background around it. That makes it easier for a model to learn how to segment the object, but the model then fails to generalize well.
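For illustration, the global color contrast of a dataset image can be measured as the distance between the mean object color and the mean background color; FT-style images score high on it. The statistic and toy images below are my own simplification of the paper's analysis:

```python
import numpy as np

def global_color_contrast(img, mask):
    """Distance between the mean object color and the mean background
    color; high values make an object 'easy' to segment."""
    obj = img[mask].mean(axis=0)
    bg = img[~mask].mean(axis=0)
    return float(np.linalg.norm(obj - bg))

# Toy RGB images: a bright object on a dark background (FT-style,
# strong contrast) vs. an object that nearly matches the background.
img = np.zeros((10, 10, 3))
mask = np.zeros((10, 10), bool); mask[2:8, 2:8] = True
img[mask] = [1.0, 1.0, 1.0]
strong = global_color_contrast(img, mask)
img[mask] = [0.1, 0.1, 0.1]
weak = global_color_contrast(img, mask)
```

If a dataset's images mostly look like the "strong" case, a model trained on it never learns to handle the "weak" case.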

Fixations and F-measure
Here the authors discuss the effect of the center bias problem, and the methods many state-of-the-art algorithms use to counteract it. For example, AWS and SIG benchmark their performance with s-AUC (shuffled AUC), which cancels out the center bias problem.
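Shuffled AUC replaces uniformly sampled negatives with fixation locations borrowed from other images; because those borrowed points carry the same center bias, a pure center prior stops getting credit for it. A minimal sketch (the saliency map and fixation coordinates are invented):

```python
import numpy as np

def shuffled_auc(sal_map, fix_points, shuffled_points):
    """s-AUC: positives are this image's fixations, negatives are
    fixations borrowed from other images, so shared center bias
    cancels out."""
    pos = np.array([sal_map[r, c] for r, c in fix_points])
    neg = np.array([sal_map[r, c] for r, c in shuffled_points])
    # Probability that a random positive outranks a random negative
    # (Mann-Whitney formulation of the AUC); ties count as 0.5.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Invented example: a centered-Gaussian saliency map, scored with this
# image's fixations against "other-image" fixations near the center.
h = w = 32
rows = np.arange(h)[:, None] - 16
cols = np.arange(w)[None, :] - 16
center = np.exp(-(rows ** 2 + cols ** 2) / 50.0)
score = shuffled_auc(center, [(16, 16), (15, 17)], [(14, 16), (17, 15)])
```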

From Fixations to Salient Object Detection

For this part, the figure in the paper does an amazing job of describing, step by step, the authors' method for performing salient object segmentation. (And they claim that it is simple, but it really isn't simple at all, in my opinion…)

  1. Perform unsupervised segmentation using CPMC to generate candidate object masks.
  2. Obtain the spatial distribution of fixations within each object.
  3. Have a function that, given a proposed object candidate mask and its fixation map (from steps 1 and 2), estimates the overlap score (intersection over union) of the region with respect to the ground truth.
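The paper trains a scoring function over the CPMC candidates; as a rough, hand-crafted stand-in, one can rank candidate masks by how much fixation energy they capture, with a penalty for oversized masks. Everything below (masks, map, and the scoring formula) is my own illustration, not the authors' learned regressor:

```python
import numpy as np

def score_candidates(candidate_masks, fmap):
    """Rank candidate object masks (e.g. from CPMC) by the share of
    fixation energy they capture, penalizing oversized masks. A
    hand-crafted stand-in for the paper's learned scoring function,
    which regresses each candidate's overlap with the ground truth."""
    total = fmap.sum()
    scores = []
    for mask in candidate_masks:
        energy = fmap[mask].sum() / total           # fixation coverage
        area_frac = max(mask.mean(), 1e-6)          # fraction of image covered
        scores.append(energy * energy / area_frac)  # favor tight, fixated masks
    return np.array(scores)

# Toy example: fixations fall on the first candidate, not the second.
fmap = np.zeros((20, 20)); fmap[4:8, 4:8] = 1.0
good = np.zeros((20, 20), bool); good[3:9, 3:9] = True
bad = np.zeros((20, 20), bool);  bad[12:18, 12:18] = True
scores = score_candidates([good, bad], fmap)
best = int(np.argmax(scores))
```

The highest-scoring candidate is then taken as the salient object segmentation.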

Conclusion

Again, this paper has demonstrated that there is a strong correlation between fixation prediction and salient object detection. Using this knowledge, the authors proposed a novel method of performing salient object segmentation: first a segment generation process, then a saliency scoring mechanism using fixation prediction. Finally, the paper also describes a dataset bias, center bias, which is a huge problem in visual saliency analysis.

Final Words

This paper contains a lot of information, and serves as a very good introduction to the salient object segmentation task as well as eye-gaze tracking. Very interesting…

If any errors are found, please email me at jae.duk.seo@gmail.com; if you wish to see the list of all of my writing, please view my website here.

Meanwhile follow me on my Twitter here, and visit my website, or my YouTube channel for more content. I also did a comparison of Decoupled Neural Networks here if you are interested.

Reference

  1. Li, Y., Hou, X., Koch, C., Rehg, J. M., & Yuille, A. L. (2014). The secrets of salient object segmentation. Georgia Institute of Technology.
  2. MSRA10K Salient Object Database. (2014). Media Computing Laboratory, Nankai University. Retrieved 28 April 2018, from https://mmcheng.net/msra10k/
  3. (2018). Arxiv.org. Retrieved 28 April 2018, from https://arxiv.org/pdf/1611.09571.pdf
  4. How to recognize exclusion in AI — Microsoft Design — Medium. (2017). Medium. Retrieved 28 April 2018, from https://medium.com/microsoft-design/how-to-recognize-exclusion-in-ai-ec2d6d89f850
  5. The PASCAL Visual Object Classes Challenge 2010 (VOC2010). (2018). Host.robots.ox.ac.uk. Retrieved 28 April 2018, from http://host.robots.ox.ac.uk/pascal/VOC/voc2010/
  6. Frequency-tuned Salient Region Detection. (2018). Ivrlepfl.ch. Retrieved 28 April 2018, from http://ivrlwww.epfl.ch/supplementary_material/RK_CVPR09/
  7. Image Signature: Highlighting Sparse Salient Regions — IEEE Journals & Magazine. (2018). Ieeexplore.ieee.org. Retrieved 28 April 2018, from https://ieeexplore.ieee.org/document/5963689/
  8. Garcia-Diaz, A., Leboran, V., Fdez-Vidal, X., & Pardo, X. (2012). On the relationship between optical variability, visual saliency, and eye fixations: A computational approach. Journal Of Vision, 12(6), 17–17. doi:10.1167/12.6.17
