Is Artificial Intelligence Racist? (And Other Concerns)

Mauro Comi
Towards Data Science
6 min read · Nov 12, 2018


When we think of concerns about Artificial Intelligence, the two most obvious ones are job loss and lethal autonomous weapons. While killer robots might become an actual threat in the future, the consequences of automation are a complicated phenomenon that experts are still actively analyzing. Very likely, as with any major industrial revolution, the market will gradually stabilize. Advances in technology will create new types of jobs, inconceivable at the moment, which will later be disrupted by the next major technology takeover. We have seen this multiple times in modern history and we will probably see it again.

A third major field of concern is the ethical impact of AI. Here the question arises: is Artificial Intelligence racist?

Well, in short… there is no short answer.

What about a long answer? Tales of Google, Seals, and Gorillas

In order to answer this question, we first need to define what racism is.

Racism: The belief that all members of each race possess characteristics, abilities, or qualities specific to that race, especially so as to distinguish it as inferior or superior to another race or races. ~ Oxford Dictionaries

Racism involves generalizing specific characteristics to all the members of a race. Generalization is also a key concept in Machine Learning, and this is especially true for classification algorithms. Inductive learning is about deriving general concepts from specific examples. The majority of supervised learning techniques try to approximate a function that predicts the category of an input with the highest possible accuracy.

A function that fits our training set too closely overfits: in practice, it is not able to derive a proper general rule and performs poorly on new inputs. On the other hand, a function that doesn’t fit the dataset accurately enough underfits: the resulting model is too simple to produce meaningful and reliable results.
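
To make this concrete, here is a minimal sketch (not from the original article, with invented data) that fits the same noisy points with models of increasing complexity; the degree-1 model underfits, while the degree-15 model chases the noise:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Toy data: a sine curve plus noise (purely illustrative).
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=30)

for degree in (1, 4, 15):  # too simple, reasonable, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    # A high score on the training data alone can hide overfitting.
    print(degree, round(model.score(X, y), 3))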

Experts in the field know that classification is all about finding the trade-off between overfitting and underfitting: the model needs to derive general rules from a specific training set. This leads to a major problem: if the data used to train the model are biased, the model will produce biased results.

A famous case showing the consequences of biased data is the mislabelling of two young African-American people. Google Photos, which had recently introduced automatic image labelling, classified the two teenagers as “gorillas” (all the references are reported at the end of the page). Google was heavily criticized, and some started to wonder whether a machine could be trained to be racist on purpose.

The Google team immediately apologized and a spokesperson tweeted: “Until recently, [Google Photos] was confusing white faces with dogs and seals. Machine learning is hard”.

The actual reason for the misclassification is not racism at all, though. The cause of this error lies in the training set.

Superman, criminality and racism

In order to understand what we’ve just discussed, let’s look at a simple example of misclassification.

Suppose we want to predict whether Clark Kent is a criminal or not. Here is the dataset we have:

Dataset containing 5 elements

Our training set represents 5 people, belonging to three different races: Kryptonian, Human and Robot.

We are going to train a Decision Tree classifier to predict whether Clark Kent, a 31-year-old male Kryptonian, would be classified as Criminal or not.
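
The figure with the five-row dataset is not reproduced here, so, to make the snippets below runnable, here is a hypothetical stand-in built with pandas. The column names match the code that follows, but the encodings and values are invented purely for illustration:

import pandas as pd

# Hypothetical stand-in for the 5-row dataset shown in the original figure.
# Assumed encodings: Sex (1 = Male, 0 = Female), Race (1 = Kryptonian,
# 2 = Human, 3 = Robot), Criminal (1 = yes, 0 = no). Values are invented.
data = pd.DataFrame({
    'Sex':      [1, 0, 1, 0, 1],
    'Age':      [35, 28, 40, 22, 30],
    'Race':     [1, 2, 2, 3, 1],
    'Criminal': [1, 0, 0, 0, 1],
})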

First, we train the model:

from sklearn import tree

# Train a decision tree on the three features: Sex, Age, Race
clf = tree.DecisionTreeClassifier()
X_train = data[['Sex', 'Age', 'Race']]
Y_train = data['Criminal']
clf.fit(X_train, Y_train)

Then, we predict the category “Criminal” based on the trained model:

# Sex = 1 (Male), Age = 31, Race = 1 (Kryptonian)
pred = clf.predict([[1, 31, 1]])
print('Is Clark Kent a criminal? Prediction:', pred[0])

As we can see, Clark Kent is classified as Criminal. Let’s check the feature importances, in order to understand how each variable influences the final output of the classifier.
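
A quick way to inspect this in scikit-learn, assuming the classifier trained above, is the following sketch:

# Importance assigned by the decision tree to each input feature
for name, importance in zip(X_train.columns, clf.feature_importances_):
    print(name, round(importance, 2))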

Based on the dataset we used to train the model, the most important feature is the variable Race.

Bias in Computer Vision

This simple example shows the importance of data collection and data organization. When these two steps are performed poorly, ethical and cultural biases can be encoded in the machine learning model. As reported in a great article from Nature (link at the end), 45% of the images in ImageNet, the most widely used image database in computer vision, come from the United States. China and India, which account for 36% of the world population, contribute just 3% of the data in the ImageNet dataset. This imbalance unintentionally creates a bias and explains why computer vision algorithms labelled a photograph of a North Indian bride as ‘performance art’.

Joy Buolamwini, a researcher at MIT, addressed the lack of diversity in the data used to train computer vision algorithms a few years ago. She noticed that, while the most popular facial recognition systems she tested at MIT correctly classified the gender of almost every white person, accuracy dropped drastically as skin shades got darker. The lowest accuracy was for dark-skinned females, with an error rate of 34%.

How Microsoft corrupted a bot in 24 hours

Bias and errors do not only happen in image classification tasks. Natural Language Processing is the field of Artificial Intelligence that focuses on processing human language. A common methodology shared by many NLP algorithms is mapping words to geometric vectors. This technique represents documents as collections of vectors, allowing computations between words. Bolukbasi and colleagues, in their paper “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings”, show how a simple algorithm for analogies, trained on Google News articles, exhibits female/male gender stereotypes. As they report, the model states that ‘man’ is to ‘doctor’ as ‘woman’ is to ‘nurse’.
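
For a sense of how such analogies are computed, here is a minimal sketch using gensim and pretrained word2vec-format vectors. The file path is a placeholder, and the exact words returned depend entirely on the embeddings used:

from gensim.models import KeyedVectors

# Load pretrained embeddings (placeholder path; any word2vec-format
# vectors, such as the GoogleNews vectors, would work here).
wv = KeyedVectors.load_word2vec_format('GoogleNews-vectors.bin', binary=True)

# "man is to doctor as woman is to ?" via vector arithmetic:
# doctor - man + woman, then look up the nearest words.
print(wv.most_similar(positive=['doctor', 'woman'], negative=['man'], topn=3))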

This reminds us of a similar controversy: in 2016 Microsoft deployed TayTweets, a Twitter bot trained through casual conversations on Twitter. The idea was incredibly promising, given the large amount of textual data available every second on Twitter. Needless to say, the agent started to tweet misogynistic and racist remarks in less than 24 hours. Who would have thought?

Racist bots and where to find them

TL;DR

And finally, here we are at the end of our analysis. The whole point of this article is to raise an ethical issue related to AI that is often overlooked. While scientists, engineers and data scientists need to address the imbalance in training sets, users and non-experts need to understand that Artificial Intelligence is based on mathematics. And math, as we all know, can be extremely complex. Neural networks, used in image classification, are considered ‘black boxes’: the results they give come from extremely high-dimensional computations and cannot be fully controlled, even though companies are making a huge effort to understand the intermediate outputs, with amazing results (check my article about Neural Style Transfer, based on this concept).

Still, we have a last question to answer, which hopefully will be discussed in the comments below. Is AI racist?

Thanks for reading. If you have any comments or suggestions, don’t hesitate to leave them below!
