Data Science for the LGBTQIA+ community?
Data collection does not really have a great reputation in marginalized communities. We often hear news about surveillance of immigrants, communities of color, and social justice activists, all in the name of security. It would not be surprising for there to be distrust about the collection of personal data, since it so often leads to increased policing.
Aside from this obvious misuse of data, there have been a few stories out there showing that machine learning algorithms can indeed be racist and sexist. There is an ongoing discussion about the ethics followed in data science. However, I seriously doubt that most data scientists have a sufficient analysis of race, gender, etc., and I expect that algorithms will continue to be biased toward more privileged individuals. Call me a technological pessimist, if you will.
In the past 5–10 years, the field of data science has exploded. As we enter a new age where essentially everything collects data from you, companies are seeking ways to capitalize on that data. Data scientists are roughly the modern equivalent of the classical statisticians of several decades ago. We do statistics for sure, but we also spend a lot of time programming and munging so-called “big data”. It’s a pretty glamorous job. I kid.
One thing I spend some time thinking about is how I can use a random forest algorithm to advance our fight for queer justice and liberation (note: the joke in data science is that we throw random forests at everything and call it a day).
This leads to some questions that may be ripe for data analysis. Here are some examples of how predictive analytics could be used for social justice:
- LGBTQ individuals are simply more likely to be incarcerated, and more so if they are people of color. You can think of several key factors that may contribute to this, but which factors contribute the most, and which the least? Knowing this would give us a sense of how our community could restructure itself to keep our members from being incarcerated. For example, if you were able to obtain demographic, census-related, education-related, and health-related data on queer individuals who have been incarcerated, and compare it to data on those who have not, are there any interesting trends that could be informative for would-be policy makers and non-profits? What about factors surrounding recidivism? Are there data-driven approaches that could prove useful and, in turn, yield practical (and economically feasible) solutions to this problem in our community?
- LGBTQ individuals face more health disparities than their heterosexual, cisgender peers. You can think of a significant number of social factors that influence disparities in our community. These include things like: ability to afford health insurance, experiences of discrimination by health providers, distance to a health center and the feasibility of getting there, exposure to educational programs on queer-relevant health issues, occupation and/or employment status, being homeless or not, age, race, gender, etc. Specific questions range from improving access to HIV education and prevention programs, to medical access to hormones for transgender individuals, to improving health for our LGBTQ elders.
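To make the “which factors contribute the most” question a little more concrete, here is a minimal sketch of how a random forest could rank factors by importance. Everything in it is an assumption for illustration: the data is synthetic, the names `factor_a`, `factor_b`, `factor_c`, and the helper `rank_factors` are hypothetical, and I am using scikit-learn’s `RandomForestClassifier` with its `feature_importances_` attribute. No real community data is involved, and these importance scores only describe association in the data, not causation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rank_factors(X, y, names, seed=0):
    """Fit a random forest and return (name, importance) pairs, highest first.

    Hypothetical helper for illustration only.
    """
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]  # indices sorted by descending importance
    return [(names[i], float(importances[i])) for i in order]


# Synthetic stand-in data: three made-up factors for 2,000 made-up people.
rng = np.random.default_rng(0)
n = 2000
names = ["factor_a", "factor_b", "factor_c"]
X = rng.normal(size=(n, 3))

# By construction, the outcome depends strongly on factor_a,
# weakly on factor_b, and not at all on factor_c.
signal = 2.0 * X[:, 0] + 0.5 * X[:, 1]
y = (signal + rng.normal(size=n) > 0).astype(int)

ranking = rank_factors(X, y, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On real data, the interesting work is everything this sketch skips: getting the data ethically, handling missing values, and checking whether a factor that ranks highly is something a policy or program could actually change.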
The difficulty in answering any of these questions lies in obtaining data from our community. Fortunately, it is my suspicion that non-profits across the country have their own internal databases. And in this day and age, where data is valued, I suspect these non-profits are also getting into the big-data game to help guide their policy work. However, a search for “queer” and “data science” yields very little. I have no doubt there are LGBTQ data scientists out there, but there seems to be little interest in applying data science to the social good of our community.
That isn’t to say that data scientists are not about improving society using the power of machine learning. In fact, there are whole programs devoted to it. I guess my point is that too many data scientists are concerned about getting cushy jobs at the Facebooks, Googles, and Amazons of the world, and not enough are really focusing their technical expertise on matters that affect our society as a whole.
So to all you activist queers working in organizations with data: How do you want to see it used? And what questions would you hope to have answered?