Nature vs Nurture. In AI.

Martin Millmore
Towards Data Science
7 min read · May 19, 2019


Sir Francis Galton (image source: https://en.wikipedia.org/wiki/Francis_Galton)

When Sir Francis Galton, cousin of Charles Darwin, popularised the term “Nature versus Nurture” in the late 1800s, he was of course thinking about how much a human being’s behaviour is determined by their inherited genes, and how much by their experiences throughout life. The renowned eugenicist studied human traits and, as part of his quest to improve the human gene pool, developed the statistical concept of regression (now used heavily in Machine Learning). He concluded that humans are very strongly a product of their heritage.

While Galton’s proposals, such as paying individuals from eminent families to have children together, find no favour in today’s egalitarian society, the concept of Nature vs Nurture continues to be the subject of much debate. In the entertaining 2005 book Freakonomics, Steven Levitt and Stephen Dubner claimed that having 50 books in your house improved a child’s test scores by 5%, and 100 books improved them by a further 5%. A clear sign of Nurture helping the child. But the same study showed no corresponding effect from actually reading the books. The theory is that parents who buy books are often those who pass genes for intelligence on to their children: Nature, not Nurture.

What about AIs?

But what about Nature vs Nurture in Artificial Intelligences? Does it even make sense to ask such a question? Nature didn’t evolve the AI software. It doesn’t reproduce and pass down traits beneficial for survival. And what would Nurture even mean for a machine?

Nature

AI software may not be at the stage of self-reproduction, evolution and survival of the fittest, but with the help of human programmers it certainly exhibits some of the properties of an evolving species. One competitive arena for AI algorithms is the Kaggle website. Kaggle hosts competitions in which data scientists try to code the best AI and make the best predictions, with the winners standing to make huge amounts of money. How huge? The Department of Homeland Security, for example, ran a competition with $1.5M in prize money to improve the accuracy of its threat recognition algorithms.

Few data scientists are writing truly original algorithms themselves. Instead, most combine existing open source code to build the best model for their particular task. As new open source algorithms are released, the successful ones appear in more and more Kaggle competition entries, while weaker algorithms fall out of favour. Good algorithms are enhanced, and weak ones are ignored. One famous example is xgboost, created by Tianqi Chen in 2014. It is a gradient boosting algorithm, which combines a number of weak models into a single strong one. Its support for regularization made it significantly better than earlier boosting implementations, and it is now arguably the most popular algorithm among Kagglers.
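
As a rough sketch of what a typical competition entry looks like, the snippet below trains an xgboost classifier on a made-up dataset. The data and the parameter values are illustrative placeholders; reg_lambda and reg_alpha are the regularization knobs that helped set xgboost apart.

```python
# A minimal, hypothetical sketch of using xgboost for a Kaggle-style
# binary classification task. The data here is synthetic; a real entry
# would load the competition's training file instead.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))          # 1000 rows, 10 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient boosting: many shallow trees combined into one strong model.
# reg_lambda / reg_alpha are the L2 / L1 regularization terms.
model = XGBClassifier(
    n_estimators=200,
    max_depth=3,
    learning_rate=0.1,
    reg_lambda=1.0,
    reg_alpha=0.0,
)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```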

There are obvious similarities to survival of the fittest in animals — the strongest algorithms go on to be used extensively, while weaker ones become orphaned git repositories. The strongest algorithms get lots of pull requests, and get enhanced, growing even stronger, while weak algorithms go unmaintained. It’s a clear example of Nature driving AI forwards.

Nurture

We’ve seen how natural selection makes the best AI code thrive while the weakest dies. How could nurturing an AI possibly work? In humans, nurturing works by parents modelling behaviour for their children. This can be virtuous behaviour like eating up your vegetables or reading books (or perhaps just owning books), or it can be bad habits like smoking. Studies show that a teenager is twice as likely to smoke if their parents do, while other studies show that the strongest predictor of a child eating enough vegetables is that their parents do too. But how can this apply to AI?

AI today is primarily achieved by Machine Learning. This is where a computer algorithm is fed labelled data that allows it to build a model of what the world looks like, or at least one tiny subset of the world. For example, we teach the machine that a cat looks like this and a dog looks like that. Or we tell it who defaulted on a loan and who didn’t, and what the household traits of those people were. The computer is told what happened in the past so that it can predict what will happen in the future. That should allow us to build a true model of the world: we remove personal human opinion from future decisions and base them on the facts of what happened in the past, not on our own learned prejudices. Given that an AI can only make a useful model of the world if we have trained it on data, it seems far more strongly dependent on how we nurture it, that is, on what we teach it about the world.
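
To make that loop concrete, here is a minimal sketch of supervised learning on labelled loan data. The dataset, the column names and the choice of scikit-learn's logistic regression are all my own illustrative assumptions, not a description of any real bank's system.

```python
# A minimal sketch of supervised learning: show the algorithm labelled
# examples from the past (who defaulted, who didn't) and let it build a
# model it can apply to new applicants. All data here is made up.
import pandas as pd
from sklearn.linear_model import LogisticRegression

past_loans = pd.DataFrame({
    "income":        [25, 60, 32, 90, 41, 28],   # in thousands
    "existing_debt": [12,  5, 20,  8,  3, 15],   # in thousands
    "defaulted":     [1, 0, 1, 0, 0, 1],         # the label: what actually happened
})

# Learn a model of "what a defaulter looks like" from the labelled past.
model = LogisticRegression()
model.fit(past_loans[["income", "existing_debt"]], past_loans["defaulted"])

# Predict the future from the pattern learned on the past.
new_applicant = pd.DataFrame({"income": [35], "existing_debt": [18]})
print(model.predict(new_applicant))  # an output of 1 means predicted to default
```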

Bias

But what happens if what it was told about the past contained some bias? Some human fallibility that influenced the data it learned from. In the examples above, someone had to decide which pictures were cats and which were dogs, and perhaps some of them were labelled wrongly.

If there were just a few random mistakes, that might not be a big deal. But what if it was more malicious? When historic census data was digitised by prisoners working in UK jails, some of them entered unsavoury occupations for law enforcement officers. And it’s not just criminals who do this. Most image labelling is done by people online. Think about some of the people you have seen on social media, and tell me you can’t imagine any of them mislabelling images for kicks.

Or perhaps our model has been trained well on cats and dogs, but what happens when I show it a sheep for the first time? Or even the hundredth time? It will label it as a cat or a dog, because that is all it has ever been trained on. Something similar happened in 2015 when Google Photos started labelling photos of black people as “gorillas”. Google had no intention of doing so, but insufficient training of its image recognition on black people meant the model never learned properly.
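
The sheep problem is easy to reproduce: a classifier trained on only two classes has no way to say “none of the above”, so it spreads its confidence across the labels it knows. A toy sketch, with made-up feature vectors standing in for images:

```python
# A toy illustration of the "sheep" problem: a model trained only on
# cats and dogs must assign every input to one of those two classes.
# The feature vectors here are made up and stand in for real images.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
cats = rng.normal(loc=-1.0, size=(50, 4))
dogs = rng.normal(loc=+1.0, size=(50, 4))

X = np.vstack([cats, dogs])
y = ["cat"] * 50 + ["dog"] * 50

clf = LogisticRegression().fit(X, y)

sheep = rng.normal(loc=5.0, size=(1, 4))  # nothing like the training data
print(clf.predict(sheep))                 # still forced to answer "cat" or "dog"
print(clf.classes_)                       # ['cat' 'dog'] -- there is no third option
```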

Another problem that can occur is the unintentional use of background information in an image. Vicente Ordóñez from the University of Virginia found that his image recognition AI was more likely to guess that someone was a woman if the photo was taken in a kitchen. Combine problems like these, and you get results like the one MIT’s Joy Buolamwini reported in February 2018: while systems from IBM, Microsoft and Megvii could correctly identify the gender of a white man over 99% of the time, their error rate for dark-skinned women was as high as 35%.

What about the loan default use case? When training such a system, a bank would pass in past loans and have the algorithm learn what a good loan and a bad loan look like. Tricky things like pictures don’t come into it; it uses cold hard facts. But if the bank was biased in the past when making loans, the AI is going to share the same bias. If the bank mainly made loans to people over the age of 40, the algorithm is going to learn that good borrowers are over 40. It will have an age bias, because that is what we trained it on. And until as recently as 1980, UK banks could refuse a loan to a woman unless she had a male guarantor. Without past data showing such loans being made, the AI would not recognise them as a valid option.
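
A hypothetical sketch of how that bias gets baked in: the synthetic, deliberately exaggerated data below records mostly over-40s as good borrowers, and the trained model duly scores a younger applicant as riskier even though age is the only thing that differs.

```python
# Synthetic, exaggerated illustration: the bank's past decisions mostly
# favoured borrowers over 40, so a model trained on those outcomes
# learns that "good borrower" correlates with age.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
age = rng.integers(20, 70, size=n)

# Historical outcome label, skewed by past lending practice:
# over-40s were usually recorded as good loans, under-40s rarely were.
good_loan = (rng.random(n) < np.where(age >= 40, 0.9, 0.2)).astype(int)

model = LogisticRegression().fit(age.reshape(-1, 1), good_loan)

# Two otherwise identical applicants, differing only in age.
for applicant_age in (25, 55):
    p = model.predict_proba([[applicant_age]])[0, 1]
    print(f"age {applicant_age}: predicted probability of a good loan = {p:.2f}")
```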

Surely we can easily fix that bias: we just don’t tell the algorithm how old the applicant is, or what their gender is. Unfortunately, today’s machine learning algorithms are amazingly good at picking hints out of data. Is your name Norman? That name peaked in the 1920s and has been in steady decline since. OK, we’ll remove names. Perhaps you have an aol.com e-mail address. Oops, that’s a giveaway. How many jobs have you had, do you own or rent, do you have a car, do you play golf? All of these indicate a stronger probability of one age bracket over another.
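
And here is a sketch of why simply hiding the age column doesn’t help: the proxy features below (an old-fashioned e-mail domain, a long job history) are invented for illustration, but because they correlate with age the model reconstructs the same bias without ever seeing age itself.

```python
# Synthetic illustration of proxy leakage: age is never given to the
# model, but proxies that correlate with age carry the same signal.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 2000
age = rng.integers(20, 70, size=n)

# Proxy features that merely correlate with age (invented relationships).
uses_aol_email = (rng.random(n) < np.where(age >= 40, 0.6, 0.05)).astype(int)
num_past_jobs = np.clip(((age - 18) / 5 + rng.normal(0, 1, n)).round(), 0, None)

# Same biased historical label as before: over-40s usually marked "good".
good_loan = (rng.random(n) < np.where(age >= 40, 0.9, 0.2)).astype(int)

X = np.column_stack([uses_aol_email, num_past_jobs])  # note: no age column
model = LogisticRegression().fit(X, good_loan)

# The model still favours the applicant whose proxies "look older".
looks_young = [[0, 1]]  # no aol address, one past job
looks_older = [[1, 7]]  # aol address, seven past jobs
for name, x in [("looks young", looks_young), ("looks older", looks_older)]:
    print(name, model.predict_proba(x)[0, 1].round(2))
```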

Amazon recently tried to use AI to improve its recruitment process. The idea was to automate résumé screening by having the AI look at past successful hires and pick new candidates who looked like them. Amazon fed it 10 years of data and found it consistently picked men over women. Names were removed, then sports and clubs were hidden, and when those went the AI picked up on things like language: men use words like “executed” and “captured” more often than women. Amazon eventually abandoned the project without ever using it to hire anyone.

Fairness

Bias alone in AI is not necessarily a problem. I have a natural bias against entering rooms with hungry lions in them, but most people would consider that perfectly reasonable. I also have a bias against hiring kindergarteners to sweep chimneys, and against sending money to Nigerian princes who send me unsolicited e-mails. These biases are not unfair; there is a non-trivial risk to me in all of those cases. The problem with bias in AI comes when we consider the bias to be unfair.

The problem for AI is that fairness is a human construct. We have decided on a list of things it is unfair to discriminate on (gender, race, age, religion), but there are still things it is considered fair to discriminate on, such as intelligence and criminality. How is the machine meant to know the difference? The simple answer is that we need to focus less on the shiny new algorithms (Nature), and more on the way we train our systems (Nurture).


Vice President of AI Applications Technology at Oracle. My writing is my own work and does not necessarily represent the views of Oracle.