
What you see is what you’ll get: Twitter’s new strategy for displaying images on the timeline

Summarizing Twitter's paper on their Image Cropping Algorithm

Photo by Alexander Shatov on Unsplash

In 2020, there was a furore on Twitter over the biased behavior of its image cropping algorithm. Users complained that the crops favored white individuals over Black individuals and objectified women’s bodies. Twitter promised to look into this issue and several others to ensure responsible AI practices. This article summarizes the issues with Twitter’s image cropping algorithm, the findings of its research team, and how the company intends to bring more transparency to its existing machine learning (ML) systems. The content of this article is based on the paper published by Twitter.

Paper: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

Authors: Kyra Yee, Uthaipon Tantipongpipat, Shubhanshu Mishra


Image Cropping and its business use case

Millions of images in all sorts of shapes and sizes are uploaded to social media platforms every day. These images must be cropped to maintain UI consistency and to ensure that the most relevant part of each image is visible in the user’s timeline. Cropping by hand at this scale would take countless hours of laborious work, so companies automate the process, primarily using machine learning. Twitter used a saliency-based algorithm for this purpose.
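To see why cropping is nontrivial, consider the naive baseline: crop to a fixed aspect ratio around the image center. Here is a minimal sketch of that idea (the function name and target ratio are illustrative, not anything Twitter used):

```python
import numpy as np

def center_crop(image: np.ndarray, target_ratio: float = 16 / 9) -> np.ndarray:
    """Crop an (H, W, C) image to a target width/height ratio around its center."""
    h, w = image.shape[:2]
    if w / h > target_ratio:
        # Image is too wide: trim equal amounts from the left and right.
        new_w = int(h * target_ratio)
        x0 = (w - new_w) // 2
        return image[:, x0:x0 + new_w]
    # Image is too tall: trim equal amounts from the top and bottom.
    new_h = int(w / target_ratio)
    y0 = (h - new_h) // 2
    return image[y0:y0 + new_h, :]
```

A center crop like this frequently cuts faces out of tall portrait shots, which is exactly what smarter, content-aware cropping tries to avoid.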


Image Cropping at Twitter

Twitter is a widely known social media platform, and image sharing lies at its core. Before 2018, Twitter used face detection to crop images, but the method had numerous shortcomings.

Image Cropping based on face Detection | Image by Author
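Twitter’s pre-2018 system is not public, but a face-detection-driven crop can be sketched with OpenCV’s bundled Haar cascade; the crop size and fallback behavior here are illustrative assumptions:

```python
import cv2

# OpenCV ships a pretrained frontal-face Haar cascade with the pip package.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def crop_around_face(image_bgr, crop_h=300, crop_w=300):
    """Center a fixed-size crop on the largest detected face, if any."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # a known shortcoming: no face means no sensible crop
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest face wins
    cx, cy = x + w // 2, y + h // 2
    ih, iw = image_bgr.shape[:2]
    # Clamp the window so it stays inside the image bounds.
    y0 = min(max(cy - crop_h // 2, 0), max(ih - crop_h, 0))
    x0 = min(max(cx - crop_w // 2, 0), max(iw - crop_w, 0))
    return image_bgr[y0:y0 + crop_h, x0:x0 + crop_w]
```

The fallback case hints at the shortcomings: images with no detectable face, profile views, occlusions, or non-human subjects leave the detector with nothing to anchor on.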

In 2018, Twitter announced that it would employ speedy neural networks for the smart auto-cropping of images. This system relies on saliency, a technique that identifies the most important regions of an image from a perceptual point of view. In other words, the saliency algorithm finds the parts of an image that a person is most likely to look at first. Naturally, this includes text, faces, animals, prominent objects, and high-contrast regions.

Image Cropping based on Saliency | source: https://arxiv.org/pdf/2105.08667.pdf
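Conceptually, once a saliency map is available, cropping reduces to centering a fixed-size window on the most salient point. A minimal NumPy sketch of that idea (not Twitter’s production code):

```python
import numpy as np

def crop_at_most_salient(image: np.ndarray, saliency: np.ndarray,
                         crop_h: int, crop_w: int) -> np.ndarray:
    """Center a crop_h x crop_w window on the saliency map's maximum,
    clamping the window so it stays inside the image bounds."""
    row, col = np.unravel_index(np.argmax(saliency), saliency.shape)
    ih, iw = image.shape[:2]
    y0 = min(max(row - crop_h // 2, 0), max(ih - crop_h, 0))
    x0 = min(max(col - crop_w // 2, 0), max(iw - crop_w, 0))
    return image[y0:y0 + crop_h, x0:x0 + crop_w]
```

Note that the crop hinges on a single argmax; as the paper later points out, basing the crop on the single most salient point can amplify small disparities in the saliency scores.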

Twitter’s saliency model builds on the deep learning architecture DeepGaze II. However, such a model alone is computationally expensive and slow. To speed up the process, Twitter combines two techniques, Fisher pruning and knowledge distillation, on top of the deep learning network. In the words of the authors:

Together, these two methods allowed us to crop media 10x faster than just a vanilla implementation of the model and before any implementation optimizations. This lets us perform saliency detection on all images as soon as they are uploaded and crop them in real-time.
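The training code is not published, but the knowledge distillation component can be pictured as matching a small student network’s softened output distribution to the larger teacher’s. A hypothetical PyTorch version (the tensor shapes and temperature value are assumptions):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between softened teacher and student distributions.

    Both tensors hold raw scores of shape (batch, num_locations),
    e.g. a flattened saliency map.
    """
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
```

Fisher pruning complements this by removing the channels whose estimated impact on the loss (via the Fisher information) is smallest, shrinking the network further.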


Issues with Twitter’s Image Cropping Algorithm

Around the fall of 2020, many tweets emerged complaining about the way users’ images were displayed on Twitter: the image cropping algorithm appeared to favor white-skinned people over dark-skinned ones. Beyond this, there were also concerns about the male gaze and about representation bias, i.e., users not having the freedom to choose how they are presented on the platform.

Twitter decided to look into these concerns and re-evaluate its algorithm.

In order to address these concerns, we conduct an extensive analysis using formalized group fairness metrics. We find systematic disparities in cropping and identify contributing factors, including the fact that the cropping based on the single most salient point can amplify the disparities.


Findings and the changes proposed

The paper explains how Twitter ran elaborate tests to check its algorithm for potential gender and racial bias and for the male gaze.

Dataset & Methodology

The dataset used is WikiCeleb, which consists of images and labels of 4,073 celebrities as recorded on Wikidata. The data is divided by race and gender into four subgroups: Black-Female, Black-Male, White-Female, and White-Male.
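The core of the analysis can be pictured as a pairwise tournament: attach images from two subgroups, run the saliency model, and record which image contains the maximum saliency point. Below is a minimal sketch of that idea, where `saliency_model` is a stand-in for Twitter’s model, not a real API:

```python
import random
import numpy as np

def crop_favors_top(saliency_model, img_top: np.ndarray,
                    img_bottom: np.ndarray) -> bool:
    """Stack two equal-width images vertically and check whether the most
    salient point of the combined image falls inside the top image."""
    combined = np.concatenate([img_top, img_bottom], axis=0)
    saliency = saliency_model(combined)  # assumed to return an (H, W) map
    max_row = np.unravel_index(np.argmax(saliency), saliency.shape)[0]
    return max_row < img_top.shape[0]

def favoring_rate(saliency_model, group_a, group_b,
                  n_trials: int = 1000) -> float:
    """Estimate how often group A wins the crop in random A-vs-B pairings,
    randomizing the vertical order so position does not confound the result."""
    wins = 0
    for _ in range(n_trials):
        a, b = random.choice(group_a), random.choice(group_b)
        if random.random() < 0.5:
            wins += crop_favors_top(saliency_model, a, b)
        else:
            wins += not crop_favors_top(saliency_model, b, a)
    return wins / n_trials
```

Under demographic parity, each group would win roughly 50% of such pairings; systematic deviation from that baseline is what the paper’s fairness metrics quantify.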

Results: gender and racial disparity

The results show that the algorithm strongly favors females over males, and white individuals over Black individuals.

Findings from the Image Cropping analysis by Twitter | source: https://arxiv.org/pdf/2105.08667.pdf

Results: the male gaze

For the male gaze issue, the results showed that for every gender, roughly 3 out of 100 images had the crop at a location other than the head. These crops did not land on bodies either; they usually fell on regions of the image displaying numbers, such as sports jerseys. The behavior was similar across genders.

Analyzing the male gaze issue in Twitter’s Image cropping algorithm | source: https://arxiv.org/pdf/2105.08667.pdf
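To check where a crop lands, each image’s most salient point can be compared against an annotated head bounding box; the annotation format below is a hypothetical stand-in for the paper’s human labeling:

```python
import numpy as np

def crop_location(saliency: np.ndarray, head_box: tuple) -> str:
    """Classify the most salient point as landing on the head or elsewhere.

    head_box is (top, left, bottom, right) in pixel coordinates.
    """
    row, col = np.unravel_index(np.argmax(saliency), saliency.shape)
    top, left, bottom, right = head_box
    return "head" if (top <= row < bottom and left <= col < right) else "other"
```

Aggregating this label over many images per gender is what yields the roughly 3-in-100 “not on the head” rate reported above.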

The paper also discusses additional factors that amplify disparate effects, such as dark backgrounds, dark eyes, and higher variability. Finally, here is what the authors conclude:

However, we demonstrate that formalized fairness metrics and quantitative analysis on their own are insufficient for capturing the risk of representational harm in automatic cropping. We suggest the removal of saliency-based cropping in favor of a solution that better preserves user agency. For developing a new solution that sufficiently address concerns related to representational harm, our critique motivates a combination of quantitative and qualitative methods that include human-centered design.

As a result, users now get an accurate preview of how their images will appear when they Tweet a photo. After testing this change, Twitter rolled out the updated feature in May 2021.


Takeaways

The paper highlights how responsible AI has to be at the core of any machine learning system. Today’s widespread use of AI systems makes it imperative to ensure that their predictions are not discriminatory or unfair. Critically evaluating machine learning systems and carefully thinking through their consequences is an important first step in this direction.


References

[1] Kyra Yee, Uthaipon Tantipongpipat, Shubhanshu Mishra, Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency (2021), https://arxiv.org/pdf/2105.08667.pdf
