An ethical code can’t be about ethics
This is why I can’t support Data for Democracy’s data science code of ethics.
[EDIT: I’ve tried to put together a more comprehensive take on the subject of tech ethics here: https://hackernoon.com/can-we-be-honest-about-ethics-ecf5840b6e07]
Last week, I wrote about my skepticism of Data for Democracy’s intent to create a data science code of ethics. My concerns focused on the practical feasibility of the project. After a lot of talking, reading, and watching the evolution of the D4D code of ethics, I still believe the proposed principles are largely unactionable. I also now believe that what the working groups have produced is built on the wrong foundation entirely. This isn’t about iterating forward to a solution. No amount of revision can succeed if you’re building the wrong thing.
We need to be clear on what a code of ethics means. If we can realistically expect everyone in the community to adopt a code of ethics simply because they intuitively feel it’s the good and right thing to do, then the code of ethics is unnecessary: it amounts to nothing more than virtue signaling. If we can’t realistically expect complete organic adoption, then the code is a mechanism to coerce those who disagree with it, to censure people who don’t abide by it. Those two routes, wholesale voluntary adoption or coercion, are the only ways a code of ethics can actually mean anything.
By what right does any subset of people in a profession declare what is right or wrong for everyone in that profession? By no right at all. No one has the right to dictate morality to others, but sometimes some people can obtain enough power to do so. And under one very specific set of circumstances, a profession as a whole can benefit from that happening.
The more I think about it, the more I’m impressed with the Hippocratic Oath. The medical profession, as I see it, rests on two premises: a person who is unhealthy should, on average over the course of their interactions with the doctor, become more healthy; and a person who is healthy should, again on average over the course of their interactions with the doctor, not become less healthy. That is how a doctor makes a living: a doctor gains personally from those two premises being true for most people.
Any practice that would allow an individual doctor to gain personally from being a doctor without those two premises remaining true would damage everyone’s ability to trust doctors. The title of “doctor”, or a medical degree, or other trappings of the medical profession mean nothing if a doctor can fail to heal the sick and still remain prosperous.
Every component of the Hippocratic Oath requires doctors to deliberately limit their own prosperity, either by benefiting others at their own expense, or by refusing to benefit from situations that normally would be fair game. That’s the key. The Hippocratic Oath is an effective ethical code exactly because it’s not a statement of right and wrong. It’s a roster of costly signals.
Costly signaling is the practice of deliberately putting yourself at a disadvantage in order to show you can afford the sacrifice. Some gazelles expend energy jumping up and down to show predators that they have so much energy that they can waste it, proving they are too strong and fast for the predator to bother with them. People in religious communities sacrifice resources to prove that they get enough from the community to make up for the loss. Costly signaling is a way to put skin in the game, a concept most thoroughly fleshed out by Nassim Taleb: “anyone involved in an action which can possibly generate harm for others, even probabilistically, should be required to be exposed to some damage.” If someone engages in costly signaling, it means they actually have an abundance of something — skills, knowledge, resources, strength, intelligence, commitment — that allows them to shoulder that cost.
Those are the people you want to hire. Those are the people who have shown that they have enough competence that they are willing to take a hit in other areas. For a doctor, that means they have a good enough track record of making the sick healthy and keeping the healthy well that they don’t need to conserve their time by refusing to teach, or to accrue side benefits from their relationships with their patients. Charlatans only engage in signaling that doesn’t cost them anything. Costly signals weed out charlatans.
How does that translate to data science? People in a business, over the course of their interactions with a data scientist, should…what? Make better decisions? That’s so general as to be meaningless. Make more profits? There’s a huge chain of causation between data science and profits — there are too many ways, some of them legitimate, to explain away failure to perform.
People in a business, on average over the course of their interactions with a data scientist, should increase the percentage of time and resources devoted to decisions that only humans can make.
That, in my view, is a core purpose of data science. (Not the only one, of course, but we can focus on this one for the present discussion.) There are things people spend their time on that machines can do just as well. There are other things people spend their time on that machines can’t do at all. No one who runs an organization ever runs out of decisions that need to be made. That means a lot of decisions are necessarily made hastily, or shuffled onto unqualified subordinates, or forgotten until they fester long enough to become a real problem. People end up paying attention to things they shouldn’t, simply because they can’t sift through competing demands for their attention fast enough.
The promise of data science is that many of the decisions currently made by humans can be made as well or better by algorithms. “Better” can mean more accurate, more efficient, more nuanced, more cost-effective, etc. Automating decisions is nothing more than an academic exercise until that automation frees up people to do other things. Freedom to make decisions is data science’s version of patient health — consistently maintaining and growing that freedom needs to be the basis for an individual data scientist’s prosperity.
Given that an ethical code necessarily has to be imposed upon the many by the relatively few, the only way to ethically define an ethical code is to stipulate, not morality, but skin in the game. That drastically narrows down the scope of what should go into an ethical code, as an enumeration of costs will always be smaller than an enumeration of beliefs. An ethical code that does more than define skin in the game actually undermines ethical behavior by providing ways for people to virtue signal regardless of competency — making it harder, not easier, to ensure a high ethical standard within a profession.
Data scientists don’t need a list of ways to be virtuous. They need a list of ways to prove they aren’t charlatans. That will do more to ensure the health and trustworthiness of the profession than anything else.
Data for Democracy obviously isn’t at the end of its efforts to create a code of ethics, but the working documents they have in place right now and the conversations I’ve had clearly indicate that they are focused on enumerating virtues instead of costs. It’s the wrong basis upon which to build an ethical system.