Bias in AI: Much more than a Data problem
While data bias is a well-known cause of AI unfairness, it is definitely not the only one.
There has been a lot of discussion in the AI community in recent days around bias, especially after Yann LeCun joined the conversation following this tweet:
PULSE, the algorithm that created this image, works by searching the latent space of a pretrained GAN for high-resolution artificial images that downscale to the given low-resolution input. A bias problem with the algorithm was quickly found: given downsampled (but still very recognizable) images of famous non-white people, it still upsampled them to images of white people. Here is another clear example:
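To make that mechanism concrete, here is a minimal sketch of the latent-space search idea: keep the generator frozen and optimize only the latent vector so that the generated high-resolution image downscales to the observed low-resolution one. Everything here is an illustrative assumption, not PULSE's actual implementation: a toy linear "generator" and average pooling stand in for StyleGAN and its downscaler.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (illustrative assumptions, not PULSE's real components):
# a frozen linear "generator" mapping a 32-d latent to a 16x16 image,
# and 4x average pooling as the downscaler.
LATENT_DIM, HI, LO = 32, 16, 4
K = HI // LO                                  # pooling factor
G = rng.normal(size=(HI * HI, LATENT_DIM))    # frozen generator weights

def generate(z):
    return (G @ z).reshape(HI, HI)

def downscale(img):
    return img.reshape(LO, K, LO, K).mean(axis=(1, 3))

# The low-resolution observation we want to "explain".
target_lo = downscale(generate(rng.normal(size=LATENT_DIM)))

# PULSE-style search: optimize the latent vector (the generator stays
# frozen) so that the generated high-res image downscales to the target.
z = np.zeros(LATENT_DIM)
init_loss = np.sum((downscale(generate(z)) - target_lo) ** 2)
lr = 0.1
for _ in range(2000):
    residual = downscale(generate(z)) - target_lo          # LO x LO
    # Gradient of the squared error w.r.t. z, chained through the two
    # linear maps (average pooling spreads each residual over K*K pixels).
    grad_img = np.repeat(np.repeat(residual, K, 0), K, 1) / K**2
    z -= lr * 2.0 * (G.T @ grad_img.ravel())

final_loss = np.sum((downscale(generate(z)) - target_lo) ** 2)
print(f"loss: {init_loss:.3f} -> {final_loss:.2e}")
```

Note that the low-resolution image underdetermines the high-resolution one, so many latents fit equally well; which one the search lands on depends on the generator's learned distribution of faces. That is exactly where demographic bias can enter even if the search procedure itself is neutral.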
LeCun replied to that tweet stating that ML systems are biased when the data they are trained on is biased.
Part of the AI community reacted to this tweet, arguing that LeCun's statement implied that ML systems are biased only when data is biased. One of the most active experts who engaged in the conversation was Timnit Gebru, technical co-lead of the Ethical Artificial Intelligence Team at Google.
I must admit that following that conversation led me to a lot of interesting resources on the topic of AI Ethics and, in particular, bias. Let me share some of the most interesting ones I found:
- Fairness Accountability Transparency and Ethics in Computer Vision by Timnit Gebru and Emily Denton. Especially relevant in the current context of the use of computer vision for surveillance purposes.
- Our Data Bodies project: a very small group committed to understanding “how communities’ digital information is collected, stored, and shared by government and corporations”, especially in marginalized neighborhoods.
- fast.ai AI Ethics resources, compiled by Rachel Thomas. A really interesting set of videos, links, and introductions to AI Ethics experts and institutions worth following.
- About the political design of AI systems, an interesting article debunking the idea that AI bias is caused only by biased data.
One of my short articles in Towards Data Science, “Computer says no”, included a high-level view of the framework we use at my current company to deal with AI Ethics. As you can see, there is a lot to consider in order to achieve truly positive social impact and fairness through AI.
I am really excited to be a Founding Editorial Board Member of Springer’s new AI and Ethics journal, where I will have the pleasure of exploring this topic in detail together with an amazing group of experts.
Do you have a resource that you think is highly relevant in the AI Ethics space? I would be more than happy to update this article regularly with your contributions.
If you enjoyed reading this piece, please consider a membership to get full access to every story while supporting me and other writers on Medium.