Fairness and Bias

The god trick in data and design: how we’re replicating colonial ideas in tech

Miranda Marcus
Towards Data Science
Nov 17, 2020


Photo by NOAA on Unsplash

I originally trained as a designer before retraining in digital anthropology and falling backwards into working with data. In my experience, data and design are seen as pretty separate disciplines on the whole. There may be respect between the two, but, as with many things arty and sciencey, they are essentially set up in opposition to each other.

User-centered design is currently the predominant design methodology for creating technology. It is efficient at surfacing everyday human problems and at producing empathetic designs that are intuitive and suited to users’ needs. Data science has evolved into a field that broadly refers to all the different ways of yielding value from data, many of which are critical to the identification, creation and monetisation of digital products and services. But while pictures and numbers may feel distinct, the development of consumer networked technologies has led to a common practice that combines the two disciplines.

Increasingly, data is used as a design material and design is used as a conduit for data collection and use. The user needs identified through user research, and the designs created to meet them, are verified and refined through data-enabled optimisation to create the best possible product. In commercial settings, design is critical in creating a product that is ever easier to use and ever more compelling, in order to retain users and, in many cases, collect data about them which can then be monetised. At first glance, one may seem a yin to the other’s yang: the qual and the quant, the raw and the cooked, the objective and the subjective, the mechanised and the human, each countering the other’s limitations to create something more whole and more effective. I’m a strong advocate for interdisciplinary working and I firmly believe that the better we are at thinking through the same question with multiple methods, the better knowledge and things we can create. Melanie Feinberg’s paper ‘A Design Perspective on Data’ goes as far as suggesting that data creation can be understood as a multilayered set of interlocking design activities, while people like Tom Kun have shown that data presents many interesting opportunities for design practice yet to be explored.

But whilst there is much to be gained from combining design and data practice, we must also recognise that they share a set of limitations. There is a long, well-documented and disturbing history of oppressive data science that perpetuates multiple forms of injustice, of which surveillance capitalism is just one of the most recent iterations. Although computing’s awareness of historical and structural inequalities, and its repertoire of tools for anti-oppressive data science and technology design, have grown in recent years, the pace of society’s datafication and the demand for solutionism it drives seem to have moved even faster.

To put it simply, both practices perpetuate ‘the god trick’, which places designers and data scientists on a pedestal and separates them from ‘the users’. This artificial separation of observed and observer is a hangover from colonial research practices, and through this implicit othering both disciplines have the potential to reinforce structural racism and injustice. These risks are amplified when the practices are combined.

Data Settings

Donna Haraway coined the term ‘the god trick’ in her 1988 essay Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective. It refers to the way that ‘universal truths’ seem to be generated by disembodied scientists who can observe “everything from nowhere”. She suggests that the tendency for scientists to seemingly set themselves outside the situated, embodied world they observe as their object of research is an ontological move intended to preserve objectivity. Very few scientists would claim they can actually perform this trick, but Haraway was highlighting the way that scientific knowledge is treated as if it were pure, without trace of the vantage point and values of the observer. The situation or context in which data is collected has an inalienable relationship to the nature of the knowledge it can generate.

It is a well-known trope that technology is not neutral — the affordances of how it is conceived, constructed and consumed inform the way it is used and the effect it has on the world around it. This is a basic tenet of material culture theory — the things we make, how we make them, and how we use them reveal the social structures and values of that culture. In Haraway’s words:

“Technology is not neutral. We’re inside of what we make, and it’s inside of us. We’re living in a world of connections — and it matters which ones get made and unmade.”

The same is true for data. A data set is made; it is collected and crafted like any other object, and therefore bears the same traces of its creator. The consequence of the god trick is that those using the data to create and run technology are separated from the context of the data: the people, environments and values that contributed to its creation. What is lost is that every data set has a setting, and in the process the data comes to be treated as objective fact.

Adrian Mackenzie talks about how algorithms are often understood as a “context free grammar”, but he points out that this is mystifying because it does not take into account that the “actual code that programmers read and write… encounters lives, institutions and events”. Basically, there is an inalienable relationship between created and creator, and that goes for people collecting data and people building models with it. Similarly, and maybe more bluntly, Paul Kockelman demonstrates the relationship between programmer and programme using the image of a hole. This hole sorts things that are bigger or smaller by virtue of what will fall into it. He outlines how, in order for the rule to be applied, a hole must have an “affinity” for the qualities of the objects around it. This affinity does not come from the hole itself, but from the person who dug it.
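
To make the metaphor concrete, here is a toy sketch in Python (the function names and the 10 cm cutoff are my own illustrative choices, not from Kockelman’s work). The rule the hole applies looks mechanical and neutral, but its “affinity” is set entirely by whoever dug it:

    # A toy illustration of Kockelman's hole: the sorting rule looks
    # neutral, but the cutoff is a judgement made by the digger.
    def dig_hole(diameter_cm: float):
        """Return a sorter whose 'affinity' is fixed by its digger's choice."""
        def falls_in(object_size_cm: float) -> bool:
            # The hole just applies the rule; the value judgement
            # lives in diameter_cm, chosen by the person who dug it.
            return object_size_cm < diameter_cm
        return falls_in

    sorter = dig_hole(diameter_cm=10.0)  # the digger's choice, not the hole's
    print(sorter(4.0))   # True: falls in ('small')
    print(sorter(12.0))  # False: stays out ('big')

Swap a trained model in for the threshold and nothing changes structurally; the affinity still arrives from outside the code.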

In their book Data Feminism, Catherine D’Ignazio and Lauren Klein present a new way of thinking about data science and data ethics informed by feminist theory. They make the point that data scientists have moved from being referred to as ‘janitors’ who clean data to ‘unicorns’, ‘wizards’, ‘ninjas’ and ‘rockstars’ with special magic skills that set them apart from mere mortals. What links all these characterisations is that they are consistently imagined as men who work alone to triumph against great odds. In reality, data scientists sit apart from the collection and maintenance of the data and become ‘strangers in the dataset’, cut off from its context.

Being a stranger in the dataset is not an inherently bad thing, but it carries significant risk of what renowned postcolonial scholar Gayatri Spivak calls epistemic violence — the harm that dominant groups like colonial powers wreak by privileging their ways of knowing over local and Indigenous ways.

Graphic by Catherine D’Ignazio. Sourced from www.mediacloud.org, searching articles posted between 2012 and 2018.

When this is not acknowledged, the technology the data drives inevitably enforces and amplifies the values inherent in the spaces between the data points on those using the tech and beyond. Those values rarely take into account people on the margins of the mainstream, meaning vulnerable communities are often negatively affected. This is not an abstract philosophical problem. It is what fuels the practices of data extractivism and data colonialism.

There is loads of amazing practical work on ways to bring data practice back down to earth. I have listed some below but would love to hear of more!

  • Data Feminism: a brilliant book by Catherine D’Ignazio and Lauren F. Klein presenting a new way of thinking about data science and data ethics that is informed by the ideas of intersectional feminism.
  • Situated data practice: a paper by Jill Rettberg that proposes a (very cool) new method for analysing social media platforms and digital apps that allows researchers to analyse how data is constructed, framed and processed for different audiences and purposes. As Jill Rettberg says “data is always partial and always situated”.
  • Indigenous AI: a starting place for those who want to design and create AI from an ethical position that centers Indigenous concerns.
  • Transfeminist technologies: a space that provides tools for collective brainstorming on alternative imaginaries surrounding technologies. There is substantial evidence that most of the algorithms that command our digital interactions are biased, particularly towards reproducing what black feminist scholar Patricia Hill Collins calls the matrix of domination (capitalism, heteropatriarchy, white supremacy and settler colonialism). In other words, tech is currently designed in a way that helps maintain the status quo of social inequality and the norms of a consumerist, misogynist, racist, ableist, gender-binary and heteropatriarchal society.

Empathy as an extractive technology

Ideas of colonialism and extractivism might not seem particularly relevant to user-centered design. Many practitioners describe empathy, and designing with users, as central to the discipline, which seems to run counter to the idea of a stranger in a dataset. But anthropologist and design researcher Sekai Farai makes the very compelling point that, despite the empathy rhetoric, many of the practices commonly used in commercial design research are exploitative.

“There is a sympathetic veneer to all forms of exploitation — capitalists call it competition, colonial anthropologists called it curiosity, engineers call it logic, design calls it aesthetics and research calls it empathy… Empathy is an extractive technology.”

Many design research techniques have their roots in anthropology, which in turn has its roots in colonial research practices. Think of anthropology and you’re likely to think of the ‘civilised’ white guy in a pith helmet observing ‘savage’ indigenous cultures. This approach was grounded in the distinction between observed (emic) and observer (etic). The practice of an anthropologist setting themselves apart from those they are working with, and consistently holding the power in the room while they decide how to represent the ways of the other culture, seems pretty akin to the god trick to me. It’s just a different form of data. The difference is that the power imbalance between the guy in the pith helmet and the indigenous communities was explicit.

Modern anthropologists are actively grappling with these roots in order to learn from the harm those practices can cause the communities being ‘studied’. Anthropology has by no means overcome its problematic origins, but there is an active and diverse discourse in the field. This thread from anthropologist Philip Loring is a fantastic example: in it he explains why he and his co-authors retracted an in-press academic article based on a historical dataset of indigenous hunting and fishing in Alaska, because they had not partnered with Alaska Native organizations, scholars or traditional knowledge holders when designing the study or doing the analysis.

But the self-reflection of anthropology does not seem to have translated into commercial contexts for user research. Just like the separation of data wizards from the context of the data, user researchers and designers often propagate a similar distinction between observed and observer; in a room of potential users, the researcher all too often has all the power. They use this power to extract data from those they observe (emic) and organise it into usable insights to inform technology designs (etic). Their practice is all too often one of using the civilisation of technology to create order out of the messy, disordered savagery of the needy user. The heart of what Farai is saying is that empathy does not scale. In a capitalist context, the practices of user research all too often extract value from ‘subjects’ to create financial value that is disproportionately handed off to the shareholders, not the users. In her words:

“there is no empathy in the capitalist production of technology. What there is though is observation, separation, exploitation and domination. The industrial conditions of user research extinguish the empathetic instinct. Sometimes outright, but sometimes through our pattern making, frameworking and filtering.”

As with the data community, there are many great communities grappling with these issues. Farai calls on designers and design researchers to close the gap between themselves and their research participants — to create a dialogue and share in the process of learning.

  • Design Justice Network: an international community of people and organisations that are committed to rethinking design processes so that they center people who are too often marginalized by design. They work according to a set of principles that were generated and collaboratively edited by their network.
  • Marion Lean’s work on how materials and arts-based research can help us understand people’s relationship to systems and technology.

In short, as with any tools and ways of knowing, the combination of design and data yields huge opportunities for making great stuff. But if we’re going to tackle the systemic issues we face, we need to reckon with the god trick and bring our technology back down to earth.


Acting Head BBC News Labs / Wellcome Trust Data For Mental Health Research. ex Open Data Institute. Writes about data, design, digital, and anthropology.