Confronting Epistemic Injustice with Humanistic Personalization

Using the ethical foundations of the GDPR to usher in a new paradigm of digital identity and personalization

Travis Greene
Towards Data Science


The GDPR is based on a particular European view of the human person and aims to protect essential aspects of personhood in the digital age. Photo by DDP on Unsplash

Machine learning-backed personalized services have become a permanent fixture in our increasingly digital lives. Personalization relies on vast quantities of behavioral big data (BBD): the personal data generated when humans interact with apps, devices, and social networks. BBD are the essential raw material of our digital representations.

As life steadily moves online, the digital representations of persons take on legal and moral importance. Influential European legal theorists and philosophers have even written an Onlife Manifesto, shaping the discourse around what it means to be human in the digital age. At the same time, the IEEE has articulated a vision of Ethically Aligned Design (EAD) that empowers “individuals to curate their identities and manage the ethical implications of their data.”

But what would such a design mean for recommender systems, for instance? And what’s the point of giving people this power? What underlying notions of the human person are driving this kind of thinking? That’s what we want to unpack.

Wilhelm Dilthey and Max Weber believed the human sciences required a fundamentally different methodology from the natural sciences. We propose narrative understanding as a paradigm for giving meaning to our digital experiences and respecting diverse perspectives. Photo by Andriyko Podilnyk on Unsplash

Our Vision: Humanistic Personalization

We introduce our notion of humanistic personalization as a way of thinking about digital identity and personalization that draws on the fundamental ethical values embodied in the EU’s General Data Protection Regulation (GDPR). Humanistic personalization looks first at which capacities make the human person unique and then tries to imagine what recommender systems and personalization would look like if they were to support these capacities.

Our notion of humanistic personalization envisions a shift from an implicit, behavior-based representation paradigm, dominated by our “organismic” interests, to one centered on conscious, explicit, and “reflective” feedback, realized through dialogic narrative construction between data controllers and data subjects. Humanistic personalization is inspired by the philosophical ideas of Kant, Hegel, Habermas, Ricoeur, Derrida, and others.

Beyond personalization, a focus on narrative could have wide-ranging consequences for the future of AI/ML. If we are ever to “crash the barrier of meaning in AI,” we will also need to crash through the barrier of narrative. Further, the inherent intelligibility of narrative could be useful in the emerging area of “user-centric” explainable AI, especially where regulations such as the GDPR give data subjects rights to clear, understandable explanations of algorithmic decisions. Lastly, due to its intuitive “explanatory force,” narrative explanation could serve as an interesting lens for new approaches to causal modeling.

Narrative Accuracy and Epistemic Injustice

We offer the concept of narrative accuracy as an orienting design goal for personalization. By maximizing the narrative accuracy of both the personal data used as input to the recommender system and the resulting recommendations themselves, we can reduce the epistemic injustice done to persons via personalization.

Epistemic injustice is multifaceted. It refers to the unfair “distribution” of credibility to a person’s truth claims, in a way that devalues them in their capacity as a knower. It can also mean that we lack the conceptual resources to understand the experiences of others. Why focus on narrative accuracy and its complement, epistemic injustice?

Because we believe the concept of narrative to be a crucial feature of human experience worth protecting. At the same time, we reject the Enlightenment ideal of one single, universal “method” for settling questions of truth.

In other words, achieving a completely objective “view from nowhere” is not possible. Instead, knowledge gains robustness as we fuse input from diverse perspectives. If you’d like to read more, particularly about the philosophical anthropology of the GDPR, see our working paper Beyond Our Behavior: The GDPR and Humanistic Personalization.

So What is Personalization, Anyway?

Personalization is huge business. Netflix, for example, claims to save $1B per year due to its personalization efforts. Here’s Facebook describing how it uses your personal data to “personalize” your experience on the platform under its new TOS:

Your experience on Facebook is unlike anyone else’s: from the posts, stories, events, ads, and other content you see in News Feed or our video platform to the Pages you follow and other features you might use, such as Trending, Marketplace, and search. We use the data we have — for example, about the connections you make, the choices and settings you select, and what you share and do on and off our Products — to personalize your experience.

Why We Suggest Grounding AI Ethics in the GDPR

Simply put, we believe the GDPR serves double duty as both legal norm and ethical foundation for AI/ML. The main reason is that we don’t think it’s productive to add to the current morass of competing principles, guidelines, and frameworks for Ethical AI/ML. One paper alone lists at least 84 examples of “AI Ethics” guidelines, most of which are less than four years old. No one can make sense of all this.

Further, tying ethical principles to legal norms via the GDPR is valuable because, in principle, the GDPR applies to any data controller anywhere in the world that processes the personal data of data subjects residing in the EU. Law is institutionally backed: the state’s monopoly on the legitimate use of physical force ultimately secures compliance. So ethics (potentially) backed by force is, in our view, much more likely to foster compliance by industry and researchers around the globe.

Informational Self-determination and the Right to Personality

The rights given to data subjects under the GDPR reflect a certain European understanding of the human person. These principles are valuable because they have withstood intense philosophical scrutiny over centuries. From the European perspective, data protection and privacy are tools aimed at preserving human dignity. If you’re really interested in the details, we explore the philosophical thought behind what makes human life valuable in our paper linked above.

In any case, two key notions underlie the ethical foundations of the GDPR: informational self-determination and its predecessor, the right to the free development of one’s personality. According to legal scholars Antoinette Rouvroy and Yves Poullet, informational self-determination is defined as “an individual’s control over the data and information produced about him,” and is a precondition for any kind of human self-determination. There is a political dimension to self-determination as well, as a cooperative, democratic society depends on its citizens having the capacity for self-determination. In our paper, we connect these ideas with Jürgen Habermas’ notion of communicative reason.

Our Digital Behavior is Fundamentally Misinterpreted

We assert that personalization only seems so incredibly accurate because we, as linguistic, social, and physically embodied animals, have deceived ourselves about our potential for free movement and thought.

Digital environments constrain what is humanly possible even further. Perhaps even worse, as degrees of freedom in digital environments are reduced, hermeneutic¹ problems of human action arise and present ethical problems. For one, the behaviorist assumptions behind the collection of implicit data fail to appreciate an important caveat: the meaning of digital behavior is fundamentally underdetermined. Because humans are conscious, intentional beings, each action we initiate can be seen from an external/physicalist or an internal/phenomenological perspective.

When complex behaviors (e.g., completing a transaction) are broken down into overly narrow “sub-symbolic” categories² (e.g., clicks, mouse trajectories, and other “microbehaviors”) by BBD platforms and data scientists, intentions become decoupled from results. What is more, a clear one-to-one mapping of intentions to actions becomes impossible. One cannot intend to do what one cannot first identify.

As psychologists Vallacher and Wegner put it,

As philosophers have long noted, any segment of behavior can be consciously identified in many different ways. Something as simple as ‘meeting someone’… could be identified by anyone with an even mildly active mental life as ‘being social,’ ‘exchanging pleasantries,’ ‘learning about someone new,’ ‘revealing one’s personality,’ or even ‘uttering words.’
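Their point translates directly into a data-modeling problem: the mapping from logged events to identifications is one-to-many. Here is a toy Python sketch; the event record and its schema are invented for illustration, while the candidate labels are Vallacher and Wegner’s own:

```python
# One logged micro-event (hypothetical schema) and the many ways it
# can be consciously identified. Nothing in the record itself selects
# among the labels.
event = {"type": "message_sent", "to": "new_contact", "chars": 42}

candidate_identifications = [
    "being social",
    "exchanging pleasantries",
    "learning about someone new",
    "revealing one's personality",
    "uttering words",
]

# Any single label a platform assigns is an interpretive choice,
# not a measurement.
for label in candidate_identifications:
    print(f"{event['type']!r} could count as: {label}")
```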

So misinterpretation (or lack of complete interpretation) is baked into digital life and is worsened when automated systems can dynamically change digital environments in real-time, such as with reinforcement learning-based recommender systems used by Facebook. We thus face a crisis of interpretation. What to do?

If we follow the GDPR, we let the data subjects themselves decide.

According to philosopher Paul Ricoeur, we are “characters” whose actions derive meaning by their emplotment into a story we tell about ourselves. Photo by Jakob Owens on Unsplash

Connecting Moral and Social Identity via Narrative Identity

Acclaimed psychologist and linguist Michael Tomasello contends that our membership in a linguistic community binds our social and moral identities. The reasons we give for our behaviors are related to our role and status within this community. From a young age, children must make decisions about what to do and which moral and social identities to form. Children make these decisions in ways justifiable both to others in their community and to themselves.

Social and moral identities are connected through this communicative, reason-giving process directed at others. Tomasello claims this process became internalized as a form of normative self-governance, making up our moral identities. Our psychical unity requires that we do certain things in order to continue to be the persons we are, seen from both the inner perspective (self, private) and the outer (other, public). This epistemic gap between inner and outer perspectives on the same event is what drives epistemic injustice, which we discuss below.

Moral and social identities are synchronic (cross-sectional) structures. They are how we represent ourselves to ourselves at particular points in time. But we have not yet explained how these identities evolve over time. For that, we need a diachronic (longitudinal) account of identity.

Narrative Identity: It’s What Gives Your Life Meaning Over Time

According to the psychologist Jerome Bruner, narratives are the instruments through which our minds construct reality. It’s worth pointing out some of their unique features that capture the human experience in all its messy and imperfect glory.

  • Diachronicity: narratives account for sequences of ordered events over human time, not absolute “clock time.”
  • Particularity: narratives are accounts of temporally-ordered events told from the particular embodiment of their narrator(s).
  • Intentional state entailment: within a narrative, reasons are intentional states (beliefs, desires, values, etc.) which act as causes and/or explanations.
  • Hermeneutic composability: gaps exist between the text and the meaning of the text. Meaning arises from understanding relations of parts to whole.
  • Referentiality: realism in narrative derives from consensus, not from correspondence to some “true” reality.
  • Context sensitivity and negotiation: readers of a text “assimilate it on their own terms” thereby changing themselves in the process. We negotiate meaning via dialogue.

Through the diachronicity of narrative, we unite our moral and social identities over time, giving rise to the uniqueness of persons.

Epistemic injustice arises when the truth claims of individuals are not given proper evidential credit. Photo by Clay Banks on Unsplash

Narrative Accuracy and Epistemic Injustice

Epistemology is the study of knowledge and its foundations. We adapt Miranda Fricker’s concept of epistemic injustice and use it to shine new light on the problem of narrative accuracy in personalization.

Fricker is interested in injustice as it relates to disrespecting someone in her “capacity as a knower.” Epistemic injustice essentially reduces one’s trust in one’s own judgment and ability to make sense of one’s lived experience. There are Kantian and Hegelian aspects to epistemic injustice. Notably, epistemic justice requires a mutual recognition of the perspective and experience of others, particularly those in positions of asymmetrical epistemic power (i.e., data subjects relative to data collectors).

Testimonial Injustice

There are two dimensions of epistemic injustice applicable to the case of data subjects receiving personalized recommendations. First, testimonial injustice might occur when prejudice or bias leads a data collector to give a “deflated level of credibility” to a data subject’s interpretation of a recorded action or event, including a recommendation.

For example, if a data collector uses only non-consciously generated BBD and gives no weight to explicit feedback, a kind of testimonial injustice has occurred. Another example: a BBD platform allows users to “downrate” bad recommendations, but these ratings are never actually factored into changing the recommendations.
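To make this concrete, here is a minimal Python sketch of the feedback weighting at issue. The function name, the signals, and the weights are our own illustrative assumptions, not any real platform’s pipeline:

```python
import numpy as np

def preference_score(implicit, explicit, w_explicit=0.5):
    """Blend implicit BBD signals (clicks, dwell time) with explicit,
    consciously given feedback (ratings, downvotes).

    implicit, explicit: arrays of per-item scores in [0, 1].
    w_explicit: credibility granted to the data subject's own testimony.
    Setting w_explicit = 0.0 reproduces the testimonial injustice
    described above: the subject's explicit interpretation of her own
    behavior carries no evidential weight.
    """
    implicit = np.asarray(implicit, dtype=float)
    explicit = np.asarray(explicit, dtype=float)
    return (1.0 - w_explicit) * implicit + w_explicit * explicit

# A user who clicks an item often (high implicit signal) but has
# explicitly downrated it (low explicit signal):
implicit_signal = [0.9, 0.2, 0.6]
explicit_signal = [0.1, 0.8, 0.5]

print(preference_score(implicit_signal, explicit_signal, w_explicit=0.0))
# -> [0.9 0.2 0.6]  (testimony ignored)
print(preference_score(implicit_signal, explicit_signal, w_explicit=0.5))
# -> [0.5  0.5  0.55]  (testimony given credibility)
```

The point is not this particular blend, but that something like w_explicit is an ethical parameter: it encodes how much credibility the system grants the data subject’s own testimony.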

From the standpoint of Bayesian model averaging, we can also conceive of testimonial injustice as arising when uncertainty in model selection is ignored: the data subject’s own subjective “model” of her behavior is discarded in favor of the pre-defined model of the data collector or processor.
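A minimal sketch of this idea, with two invented candidate “models” and made-up posterior weights. Bayesian model averaging keeps both interpretations in play rather than letting one winner take all:

```python
import numpy as np

def predict_collector(item_features):
    # Data collector's pre-defined model: preference ~ click propensity.
    return item_features[:, 0]

def predict_subject(item_features):
    # The data subject's own "model": preference ~ stated interest.
    return item_features[:, 1]

# Posterior model probabilities p(M_k | D), invented for illustration.
# Winner-take-all selection (e.g., [1.0, 0.0]) discards the subject's
# interpretation entirely; averaging retains the uncertainty over
# whose model is right.
posterior = np.array([0.6, 0.4])

# Two items: columns are (click propensity, stated interest).
X = np.array([[0.9, 0.1],
              [0.3, 0.8]])

predictions = np.stack([predict_collector(X), predict_subject(X)])
bma_prediction = posterior @ predictions  # sum_k p(M_k|D) * prediction_k
print(bma_prediction)  # [0.58 0.5 ]
```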

Hermeneutical Injustice

Hermeneutical injustice may arise when a data collector or data collection platform lacks the “interpretive resources” to make sense of the data subject’s lived experience, thereby putting him at a disadvantage. The fundamental question is, what counts as what?

Under one interpretation of an event we may generate one set of statistical regularities, while under another we may get different statistical regularities, which then become encoded in the parameters of ML models. It follows that there is no one “best” representation or encoding of BBD.

There are simply different representations under different interpretations about what counts as what.

Currently, the categories of events recorded by BBD platforms are typically pre-defined by system designers without any input from platform users. If the designers of recommender systems do not consider the diversity and richness of data subjects’ intended actions, values, and goals while using the system, hermeneutical injustice will be unavoidable.
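As a hypothetical illustration, consider the same raw event log encoded under two invented taxonomies. Each taxonomy fixes “what counts as what,” and each yields a different representation of the same behavior:

```python
from collections import Counter

# The same raw event log encoded under two different interpretive
# taxonomies. Both taxonomies are invented here for illustration.
raw_events = ["click", "scroll", "click", "dwell", "click", "share"]

# Taxonomy A: the designer's pre-defined view (engagement metrics).
taxonomy_a = {"click": "engagement", "scroll": "engagement",
              "dwell": "attention", "share": "engagement"}

# Taxonomy B: a subject-informed view (the dwell was comparison
# shopping; the share was a warning to a friend).
taxonomy_b = {"click": "browsing", "scroll": "browsing",
              "dwell": "comparison_shopping", "share": "warning_friend"}

for name, taxonomy in [("A", taxonomy_a), ("B", taxonomy_b)]:
    encoded = Counter(taxonomy[e] for e in raw_events)
    print(f"Taxonomy {name}: {dict(encoded)}")

# Taxonomy A: {'engagement': 5, 'attention': 1}
# Taxonomy B: {'browsing': 4, 'comparison_shopping': 1, 'warning_friend': 1}
```

Different taxonomies yield different feature counts, and hence different statistical regularities for any downstream model trained on them.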

The Future of Humanistic Personalization

Making and sustaining a coherent digital self-narrative is a uniquely human capacity which we cannot leave up to others or outsource to automated agents. This sentiment is shared by the GDPR and IEEE EAD principles. We are characters in the stories we tell about ourselves. We know which events define us, we know which values drive us, we know the causes (reasons) behind our actions. And if we do not, we have the capacity to try to find out.

The corporate owners of BBD collection platforms and data scientists may make claims to the contrary based on their statistical analyses of our observed behaviors, but we believe that rights to informational self-determination trump these assertions.

As postmodernists have pointed out, problems of ethics and interpretation are inseparable. What we believe to be true influences our decisions about what is right. But if meaning is socially constructed, data subjects alone cannot solve these problems. It will take both a community and good-faith communication to work out the “rules” of our common language game. Data scientists will need to play a larger role in this dialectic of meaning negotiation and identity formation in the digital sphere. After all, if the original meaning of “category” is “to publicly accuse,” the data subject, as a member of the public, should play a part in that process.

Skeptics might counter that optimizing for narrative accuracy will require a trade-off in the ability of recommender systems to accurately recommend items and predict specific behaviors. Business profits may also be affected. Nevertheless, the GDPR forces us to ask the question:

Do we ultimately wish to represent ourselves according to the needs and interests of business, or humans?

[1] Hermeneutics was originally about the study of methods of interpretation of biblical texts, but was re-invented as an epistemological method by philosophers in the 20th century.

[2] Originally in Greek, the word “category” meant something like “to publicly accuse.” Notice the role played by social consensus.
