Reading List

“We Need to Know Our Data”

For Pride 2021, a selection of stories on the intersection of data, bias, and LGBTQIA+ communities

Ben Huberman
Towards Data Science
4 min read · Jun 23, 2021

Photo by George Kedenburg III on Unsplash

There’s a widespread—and understandable—unease about attempts to measure or quantify traits that people have historically associated with marginalized groups. This iffy (or even icky) feeling goes back to the very origins of modern data science, and attempts by 19th-century pseudosciences like phrenology and physiognomy to detect and catalog deviance from social norms; at the time, these included not just criminal behavior, but also mental illness and non-heteronormative sexual orientation (the three categories, one should note, stayed lumped together well into the 20th century). Even much more recently, as Michelle Jane Tat noted in a TDS post a couple of years ago, “data collection does not really have a great reputation in marginalized communities.” Attempts to, say, leverage AI into some sort of turbocharged gaydar don’t inspire a whole lot of confidence in our collective guardrails against intrusive data-powered technology.

There’s a flip side, though, to data’s ability to perpetuate bias and reinforce marginalization: its potential to empower, enlighten, and inform both individuals and institutions. On a societal level, it’s all but impossible to tackle injustice and discrimination without seeing their concrete, real-world effects. On a more granular level (and this has also been my personal experience), there are real benefits in seeing one’s own lived experience reflected across broader groups; in some ways, it’s a prerequisite to a sense of community.

To mark and celebrate Pride 2021, I wanted to share some of the insightful and important work TDS has published over the years on the intersection of data science and LGBTQIA+ communities. These posts range from expansive data explorations to personal reflections, and they tackle head-on questions around bias and our ability to bring about positive change. They’re all very much worth your time, this month and beyond.

Towards Trans-Inclusive AI

Zachary Hay’s post is centered around a clear and powerful premise:

The use of gender as the output of AI algorithms is unscientific because gender identity is under the authority of the individual. It is not something that can be determined by looking at bodies or outward expressions.

From there, Zachary enumerates the multiple ways that companies and organizations are currently using AI, and how these practices are harmful in their insistence on a stable, easily detectable gender binary. What I appreciate about Zachary’s post, though, is that it also aims to trace the contours of potential solutions, from insisting on more transparency when it comes to AI use (notably in automated-decision systems in public agencies) to the proactive inclusion of trans knowledge and experience in the design of AI tools.

Keeping Data Inclusivity without Diluting Your Results

Even when data scientists wish to address bias in their projects and design them in an equitable way, a tension often arises between the need to keep results statistically meaningful and the desire for an inclusive dataset. Heather Krause thinks through the issue with a case study: a survey that touches on sexual orientation, but only receives a small number of responses from certain subgroups.
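To make that tension concrete, here is a minimal sketch in Python. It is not taken from Heather's post: the subgroup labels, the counts, and the shared 50% response rate are all invented for illustration, and the normal-approximation margin of error is just one common, fairly crude way to show how quickly uncertainty grows as a subgroup shrinks.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion.

    Uses the normal approximation, which becomes unreliable for very
    small n; it is used here only to illustrate the trend.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical respondent counts, invented for illustration only.
subgroup_sizes = {
    "largest subgroup": 900,
    "smaller subgroup": 60,
    "smallest subgroup": 12,
}

# Assume every subgroup shows the same observed share, so that only
# the sample size differs between rows.
observed_share = 0.5

for label, n in subgroup_sizes.items():
    moe = margin_of_error(observed_share, n)
    print(f"{label:>18}: n={n:4d}, margin of error = +/-{moe:.1%}")
```

With the same observed share in every row, the only thing that changes is the sample size, and the margin of error widens sharply for the smallest group. That is the pull toward merging or dropping small subgroups, and it is exactly where inclusivity can get lost.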

Google Knew I Was Non-Binary Before I Did

It’s funny how inherently sexist algorithms can reveal truths about non-conforming people like me. And how that can be a source of relief and freedom for us.

Ari Joury, PhD’s meditation on Google’s ads settings, and on how those settings reflected Ari’s personal journey as a non-binary person, blends essay, analysis, and memoir. It’s a powerful read, and pushes the boundaries not just of coming-out stories, but of data-science storytelling, too.

Data Activism and the Fight for Social Justice in Scotland

Dr Kevin Guyan frames his post around a very specific place—it starts in a police interview room in Edinburgh, and goes on to advocate for social justice and equity in Scotland. But the underlying message of his post applies far more broadly, and serves as a great rallying cry and closing note for this selection:

To justify the existence of a social or cultural group that exclusively works with marginalised communities, we need to know our data. Data is ammunition. Data is power.

Thank you for reading—if you’d like to share your work (or someone else’s) at the nexus of data science, AI, and LGBTQIA+ topics, please drop a link in the comments.

Editor in Chief, Towards Data Science. Previously: Editorial lead, Automattic & Senior Editor, Longreads. (he/him)