AI Alignment and Safety

Last week, we completed an eye-opening activity in one of my introductory graduate school courses. For some context, the class is designed to provide an introduction to different research paradigms within human-computer interaction (HCI) and related fields. We spent the first half of the quarter discussing the high-level elements of quality research, and we have recently turned to methods for gauging the ethics and trustworthiness of scholarly work.
For the activity, our professor had each of us analyze a research paper of our choice and write a short, 300-word snippet discussing the ethical issues either directly present in or implied by the research. We then compiled our snippets into a little "virtual magazine" of sorts, usable as a quick reference when reading scholarly papers in the future.
The end result was fascinating, in particular because we were able to find a number of ethical concerns still present in actual, published research. Here are just a few of the issues people found:
- Making general claims based on data from a small, unrepresentative sample
- Citing outdated research
- Not respecting the privacy of collected data
- Conducting research on marginalized communities, but then never coming back to actually help those very same communities
- Using unreliable data (e.g., data collected on MTurk without any check on whether participants were genuinely answering the questions; a minimal example of such a check appears right after this list)
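As a concrete illustration of that last point, here is a minimal, hypothetical sketch of one common safeguard: an attention-check question with a known correct answer, used to filter out inattentive responses before any analysis. The field names and values are made up for illustration and are not from any real study.

```python
# Hypothetical survey responses; "attention_check" is an item whose correct
# answer was stated in the instructions ("select 'agree' for this question").
responses = [
    {"worker_id": "A1", "attention_check": "agree", "rating": 4},
    {"worker_id": "A2", "attention_check": "disagree", "rating": 5},
    {"worker_id": "A3", "attention_check": "agree", "rating": 2},
]

EXPECTED_ANSWER = "agree"

# Keep only respondents who passed the attention check.
valid = [r for r in responses if r["attention_check"] == EXPECTED_ANSWER]
print(f"kept {len(valid)} of {len(responses)} responses for analysis")
```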
Our pseudo-magazine was intentionally general – and looking through it, I couldn’t help but connect all the issues discussed with my own background in computer science. In particular, I considered how many of the above issues are tied to Artificial Intelligence (AI).
AI has been a blossoming field for years now, but lately there has been a growing interest in Ethical AI. This was spurred by the realization that many currently existing forms of artificial intelligence are strongly biased – for example, there is evidence of facial recognition algorithms performing worse on people with darker skin [1].

The issue stems from the data that AI algorithms are trained on. When a model is trained on incomplete, inaccurate, or unrepresentative data, it doesn't matter how sophisticated the model itself is – its predictions will be unreliable, especially for the people the data misrepresents.
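To make this concrete, here is a minimal sketch using entirely synthetic data, and assuming numpy and scikit-learn are available. A simple classifier is trained on data in which one group is heavily underrepresented; its accuracy looks fine for the majority group, but evaluating each group separately reveals a gap that a single overall number would hide.

```python
# Minimal sketch: skewed training data leads to unequal performance across
# groups. All data is synthetic; numpy and scikit-learn are assumed installed.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, shift):
    """Generate synthetic features and labels for one group; `shift` moves the
    group's decision boundary so the two groups are not distributed identically."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

# Group A dominates the training set; group B is barely represented.
Xa_train, ya_train = make_group(950, shift=0.0)
Xb_train, yb_train = make_group(50, shift=2.0)
model = LogisticRegression().fit(
    np.vstack([Xa_train, Xb_train]),
    np.concatenate([ya_train, yb_train]),
)

# Disaggregated evaluation: measure accuracy separately for each group.
Xa_test, ya_test = make_group(1000, shift=0.0)
Xb_test, yb_test = make_group(1000, shift=2.0)
print("group A accuracy:", model.score(Xa_test, ya_test))
print("group B accuracy:", model.score(Xb_test, yb_test))
```

Evaluating performance group by group like this is the kind of check that surfaced the facial recognition disparities mentioned above.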
This is where the activity we did in class comes in. In the midst of it, I asked myself a question: can we extend and build upon this strategy in the search for ethical artificial intelligence? If so, where do we start?
I believe the answer is yes, and that the place to start is a crucial aspect of my class activity, the one that made it unique: the rich diversity of perspectives.
Let me elaborate on that a bit. My graduate program is interdisciplinary and hosts students from several different backgrounds. In this one room, we had students who specialized in computer science, sociology, UI/UX, economics, graphic design, public health, mathematics, and more. The research papers we explored ran the gamut from information visualization to axiology (a branch of philosophy concerned with what makes things valuable). Because of this, we were able to create a magazine that is simultaneously specific and general – specific in that it lets us effectively explore the ethical issues in contemporary research, and general in that it allows us to do so across a range of different research areas.
This brings me to my main point: incorporating diverse viewpoints in the initial phase of research is essential to creating ethical forms of artificial intelligence. If an AI technology is biased, it is usually because the underlying data is biased. Often this is unintentional, occurring because the researchers simply did not realize the flaw in their data collection technique. If you don't know about a bias, how can you possibly address it? By involving people from different backgrounds in the creation of these algorithms, we greatly improve the odds that the final product is ethical and inclusive.
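One small, concrete way to catch such a flaw early is a plain audit of the data before any model is trained. The sketch below uses made-up records and field names purely for illustration; it simply counts how each demographic group is represented, which is enough to make a severe imbalance visible before it gets baked into an algorithm.

```python
# Minimal pre-training data audit: count how each group is represented.
# Records and field names are hypothetical, standing in for a real dataset.
from collections import Counter

records = [
    {"image_id": 1, "skin_tone": "lighter", "gender": "male"},
    {"image_id": 2, "skin_tone": "lighter", "gender": "female"},
    {"image_id": 3, "skin_tone": "darker", "gender": "male"},
    # ...a real dataset would have thousands more entries
]

composition = Counter((r["skin_tone"], r["gender"]) for r in records)
total = sum(composition.values())
for group, count in composition.most_common():
    print(f"{group}: {count} samples ({count / total:.1%} of the data)")
```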
Let’s go back to the example of facial recognition technology. The algorithms discussed in the article cited above [1] performed worst on dark-skinned women. And while I cannot make a direct claim about the research teams that designed those algorithms, it is well documented that both women and people of color are underrepresented in STEM fields, especially computer science. Accordingly, it is plausible that a homogeneous team simply never noticed the gaps in its own training data.
To make artificial intelligence ethical, we must start from the beginning. It is not enough to create an algorithm and then contemplate how to make it ethical and inclusive in retrospect. Rather, we should be using novel techniques in the initial phases of research and assimilating multiple perspectives into the very being of our algorithms. And to do this, we need two crucial ingredients:
- Research teams that include researchers from a broad range of communities, making it easier to ensure the resulting algorithms are not unintentionally discriminatory.
- Research teams that use mixed methods – in other words, alongside the quantitative researchers who do the mathematical work of designing and programming an algorithm, we need qualitative researchers who can craft data collection and analysis methods that account for the human side of AI, not just the machine.
Of course, this is not enough by itself – but it is a start. Artificial intelligence shows no signs of slowing down in the near future. If we do not learn to make it ethical, what awaits us in the remainder of the 21st century will be anything but good.
References
[1] A. Najibi, "Racial Discrimination in Face Recognition Technology," Science in the News (Harvard University), 2020. https://sitn.hms.harvard.edu/flash/2020/racial-discrimination-in-face-recognition-technology/