Fairness and Bias

Go Ahead, Change My (AI) Mind

Agency as a missing ingredient in the AI Fairness debate.

Vincent Vanhoucke
Towards Data Science
4 min read · Feb 9, 2021

Robot — Photo by Possessed Photography on Unsplash

Have you ever tried to change someone’s mind? Of course you have. After all, it’s fundamentally what much of communication is about. It’s also incredibly difficult.

It’s even harder when it comes to the deep-seated political, religious, or social beliefs that have shaped us, because they tend to be tied to our own identity. Many beliefs are completely unconscious and can only be elicited through indirect means. There is no telling whether the job application I’m submitting will be treated differently because of my race or gender, since the person I hand it to may not even be aware of their own biases. And if they are, they can easily conceal them. Changing someone’s mind in a consistent, enduring way takes years of dedication, particularly when a belief is grounded in emotional experience.

Not so with AI systems. For all their flaws and biases, one thing that distinguishes AI systems from human decision systems is that it is comparatively easy to lay bare their biases and to change their minds.

If you wanted to uncover biases in someone’s evaluation of job applicants, the right approach would probably be to give them thousands of applications to review and measure the statistical difference in outcomes based on the attributes under scrutiny, controlling for all others. While this is routinely done at the group level when enough applicants go through a standardized process, it is impossible to do for any single decision maker, which makes it very hard to intervene except at the policy level. You just can’t have one person review thousands of applications and get statistically significant results. Nor can you simply ask decision makers about their biases: they may not be conscious of them, or they may have a vested interest in concealing them.

AI systems, on the other hand, can handle being fed thousands or millions of test samples. In fact, it’s this very scalability that makes them the subject of so much scrutiny: it is easy to poke at them, and any flaws can generally be uncovered at acceptable p-values simply by increasing the sample size. That’s a good thing, too, because it lets us deeply examine systems with the potential for tremendous impact on people’s lives.
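
To make the contrast concrete, here is a minimal sketch of such an audit in Python. The decision system under test (`model_decision`) is a hypothetical stand-in with a deliberately planted bias, not a real hiring model; the point is only that, unlike a single human reviewer, it can be queried a hundred thousand times.

```python
import numpy as np

rng = np.random.default_rng(42)

def model_decision(skill, protected_attribute):
    # Hypothetical system under audit: scores an application, with a
    # small planted penalty on the protected attribute, plus noise.
    score = skill - 0.1 * protected_attribute + rng.normal(0, 0.5, size=skill.shape)
    return score > 0.5

# Scale is what makes this possible: sample as many applications as
# you need for statistical significance.
n = 100_000
skills = rng.normal(0.5, 1.0, size=n)

# Evaluate the same pool of applications with the protected attribute
# set each way, and compare acceptance rates.
accepted_a = model_decision(skills, 0)
accepted_b = model_decision(skills, 1)
gap = accepted_a.mean() - accepted_b.mean()

# Permutation test: how large a gap would chance alone produce?
pooled = np.concatenate([accepted_a, accepted_b])
null_gaps = np.empty(1000)
for i in range(1000):
    rng.shuffle(pooled)
    null_gaps[i] = pooled[:n].mean() - pooled[n:].mean()
p_value = np.mean(np.abs(null_gaps) >= abs(gap))

print(f"acceptance gap: {gap:.4f} (p = {p_value:.4f})")
```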

But this availability bias often goes unrecognized: since similar bias studies are extremely hard to conduct on people, they are simply rarely performed. Much of the narrative we hear around AI bias exists precisely because that is what we can measure, while the vast majority of the decision systems that affect the world today are human: opaque and resistant to statistical scrutiny.

And when we do find a bias in an AI system, we can change its mind on the spot. We have control over the data and facts it is exposed to in order to build its knowledge base. As model designers, we can modify the inductive biases we encode into it, and the cost function it attempts to optimize. And as a community, we keep improving our understanding of the relevant knobs at our disposal. For instance, studies have shown that with very few grounding statements, you can affect the entire ‘belief system’ of a very large neural model. You can also finely tune the tradeoffs an ML system makes to ensure equitable outcomes at no cost in performance. This is remarkable, particularly when you contrast it with how nearly impossible it is to exert any agency over our largely unconscious human cognitive biases.
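
To give a flavor of what one of those knobs can look like, here is a toy sketch: a logistic-regression model trained by gradient descent, with an extra term in the cost function that penalizes the gap in average predicted scores between two groups (a demographic-parity-style regularizer). The synthetic data, the penalty weight `lam`, and the model itself are all illustrative assumptions, not a prescription.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic data: 5 features, a binary protected group,
# and labels that correlate with group membership.
n, d = 2000, 5
X = rng.normal(size=(n, d))
group = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.3 * group + rng.normal(0, 0.5, size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(d)
lam = 2.0  # the fairness 'knob': lam = 0 recovers the plain model
lr = 0.1

for _ in range(500):
    p = sigmoid(X @ w)
    # Gradient of the usual log loss...
    grad = X.T @ (p - y) / n
    # ...plus the gradient of the parity penalty lam * gap**2, where
    # gap is the difference in mean predicted score between groups.
    g0, g1 = group == 0, group == 1
    gap = p[g0].mean() - p[g1].mean()
    s = p * (1 - p)  # derivative of the sigmoid w.r.t. its input
    dgap = X[g0].T @ s[g0] / g0.sum() - X[g1].T @ s[g1] / g1.sum()
    grad += lam * 2 * gap * dgap
    w -= lr * grad

p = sigmoid(X @ w)
print("score gap between groups:", p[g0].mean() - p[g1].mean())
```

Turning `lam` up shrinks the gap, possibly at some cost in raw accuracy; the point is that the tradeoff is an explicit, inspectable parameter rather than an unconscious disposition.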

There is also an intrinsic fairness in the consistency of outcomes. Decisions that change for no fundamental reason are inherently unfair, which is why automated decision systems don’t necessarily need to perform better than their human counterparts to provide fairer outcomes, as long as they reduce the variance in outcomes. AIs don’t suffer from decision fatigue.
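
As a toy illustration of that consistency point (with made-up numbers, not data from any study): compare a ‘judge’ whose effective threshold drifts with noise against a fixed, even deliberately imperfect, model, by evaluating the same cases twice.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
merit = rng.normal(size=n)  # the 'true' quality of each case

def noisy_judge(m):
    # Human-like: day-to-day noise shifts the effective threshold.
    return m + rng.normal(0, 1.0, size=m.shape) > 0

def fixed_model(m):
    # Model-like: a biased threshold, but the same answer every time.
    return m + 0.3 > 0

# Fraction of identical cases that receive a different decision on a
# second evaluation.
judge_flips = np.mean(noisy_judge(merit) != noisy_judge(merit))
model_flips = np.mean(fixed_model(merit) != fixed_model(merit))
print(f"judge: {judge_flips:.1%} flips, model: {model_flips:.1%} flips")
```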

Auditability, Controllability, and Consistency deserve center stage in evaluating the fairness and ethical implications of machine learning.

Some would argue that the bar here is not human decision systems but rule-based systems, which are also arguably consistent, controllable, and auditable. But therein lies another trap: worse is never fairer. A system that performs worse at its task, even if simpler and easier to reason about, is generally disproportionately worse in terms of fairness of outcomes. I used to work on automatic speech recognition, and in the early days those systems didn’t work well at recognizing spoken English. But it wasn’t the US-born, middle-aged white male in his perfectly silent office who was most affected. It was the young and the elderly, non-native speakers, and people in challenging, noisy acoustic environments. Arguing for coarser, simpler models in the name of transparency and inspectability often ignores the simple fact that a better decision system generally improves outcomes for the ‘long tail’ dramatically more than for the dominant modes of the distribution. In fact, for all the concerns being raised about ever-larger and more intricate language models, there is strong evidence that the better they get, the more aligned they become with shared human values.

One could actually argue that the work being done on the fundamental performance of AI systems has had a much more profound impact on fairness outcomes than much of the literature that directly targets AI fairness, though without the substantial body of work that exists on this topic today, we would not even be in a position to evaluate that impact. More of our collective work in this space should be aimed less at merely exposing the flaws of AI systems, and more at exploiting the agency we have over them: our unprecedented ability to change their minds toward fairer and more equitable outcomes.

(With thanks to Ed H. Chi for his invaluable feedback on this article.)

I am a Distinguished Scientist at Google, working on Machine Learning and Robotics.