Correcting and Preventing Unfairness in Machine Learning

Pre-processing, in-processing and post-processing methods and non-quantitative approaches

Conor O'Sullivan
Towards Data Science



Fairness in machine learning is a complicated issue. To make matters worse, the people responsible for building models do not necessarily have the skills to ensure those models are fair. This is because the reasons for unfairness go beyond data and algorithms, which means the solutions will also need to go beyond quantitative approaches.

To understand this, we will start by discussing the different quantitative approaches, which we can divide into pre-processing, in-processing and post-processing methods. We will focus on their limitations to understand why, on their own, they may not be able to address unfairness. Ultimately, we need to consider fairness as a wider problem.

This is why we will move on to discuss non-quantitative approaches. These include not using ML at all or limiting how it is used, providing interpretations, explanations and the opportunity to challenge decisions, addressing the root cause of unfairness, raising awareness of the problem and building diverse teams. Effectively addressing unfairness will take a combination of quantitative and non-quantitative approaches.

Quantitative approaches

Quantitative approaches to fairness are where data scientists come in. They work by making adjustments to the data used to train a model or to the algorithm that is trained on it. Looking at Figure 1, you can see that we can divide these into pre-processing, in-processing and post-processing approaches. The division depends on the stage of model development at which they are applied.

Pre-processing, in-processing and post processing approaches to algorithm fairness
Figure 1: types of quantitative approaches (source: author)

Pre-processing

Unfair models can be a result of bias in data. Pre-processing methods try to remove bias from data before it is used to train models. The data is transformed, and an algorithm is then trained on it just as it would be on the original dataset. The hope is that the resulting model will be fair, even if it is somewhat less accurate.

One reason for an unfair model is that the target variable reflects historically unfair decisions. This is more common for subjective target variables. For example, sexist hiring practices may lead to a higher proportion of women being rejected for job applications. To address this, one pre-processing method is to swap the target variables of previous decisions. That is we relabel unfair hiring decisions to artificially increase the number of hired women in our dataset.
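To make this concrete, below is a minimal sketch of what relabelling might look like. The data, the column names and the rule for picking which rejected applications to flip (highest screening score first) are all assumptions for illustration, not a prescribed method.

```python
import pandas as pd

# Hypothetical hiring data: 'gender' is the protected variable and
# 'hired' is the (historically biased) target variable.
df = pd.DataFrame({
    "gender": ["F", "F", "F", "M", "M", "M"],
    "score":  [0.82, 0.75, 0.40, 0.78, 0.55, 0.30],  # e.g. a screening score
    "hired":  [0, 0, 0, 1, 1, 0],
})

print(df.groupby("gender")["hired"].mean())  # hiring rates before relabelling

# Relabel the most promising rejected women (highest score first) until the
# hiring rates of the two groups are roughly equal.
n_to_flip = 1  # in practice, derived from the gap in hiring rates
candidates = (
    df[(df["gender"] == "F") & (df["hired"] == 0)]
    .sort_values("score", ascending=False)
    .head(n_to_flip)
    .index
)
df.loc[candidates, "hired"] = 1

print(df.groupby("gender")["hired"].mean())  # hiring rates after relabelling
```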

One issue is that changing target variables may not always be possible. Often a target variable is objective. For example, we can't relabel previous loan defaults as non-defaults. It may also be difficult for some subjective target variables, as we may not have all the information that was available at the time the decision was made. For hiring decisions, we may have CVs. However, we may not have recordings of the interviews, which are a large part of an application.


Another reason for an unfair model is proxy variables. These are model features that are correlated or otherwise associated with protected variables, where a protected variable represents a sensitive characteristic like race or gender. Other pre-processing approaches look at “repairing” or removing bias from model features. They work by removing the association between the model features and the protected variables.
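As a first step, it can help to check which features are most strongly associated with the protected variable. Below is a minimal sketch of such a check; the feature names and values are made up, and a simple correlation only catches linear associations (a small model predicting the protected variable from each feature is a more thorough check).

```python
import pandas as pd

# Hypothetical numeric features and a binary protected variable (e.g. gender)
X = pd.DataFrame({
    "income":     [30_000, 45_000, 52_000, 28_000, 61_000, 39_000],
    "tenure_yrs": [2, 5, 7, 1, 9, 3],
    "n_accounts": [1, 3, 2, 1, 4, 2],
})
protected = pd.Series([0, 1, 1, 0, 1, 0], name="gender")

# Absolute correlation of each feature with the protected variable.
# High values suggest a feature could act as a proxy.
proxy_scores = X.corrwith(protected).abs().sort_values(ascending=False)
print(proxy_scores)
```

Once a feature is flagged as a likely proxy, the repair approaches described above can be applied to it.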

An example of such a repair approach is disparate impact removal (DIR). Protected variables are typically divided into a privileged (e.g. male) and an unprivileged (e.g. female) group. DIR works by modifying a feature so that its distributions for the two groups become similar. For example, see Figure 2, where the income distributions for males and females have been modified.

Disparate impact removal applied to the income distributions of male and female populations
Figure 2: example of disparate impact removal (source: author)

DIR maintains the rank order within each of the sub-populations. That is, if you were the highest income earner in the privileged population, you would remain the highest earner in that population. Only the rank order between the two populations is affected. The result is that we will no longer be able to use income to distinguish the two groups. At the same time, we retain some of the ability of income to predict the target variable.

DIR also has a tuning parameter which introduces a trade-off between accuracy and fairness. It allows you to control how much the distributions will shift. In other words, it allows you to remove only part of the association between the model feature and the protected variable.
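Below is a minimal sketch in the spirit of DIR, assuming a single numeric feature and a binary protected variable. For simplicity it repairs each group towards the pooled distribution rather than the “median” distribution used in the original method (Feldman et al., 2015), and the data is made up.

```python
import pandas as pd

def repair_feature(values, groups, repair_level=1.0):
    """Rank-preserving repair in the spirit of disparate impact removal.

    Each value is moved towards the value the pooled distribution takes at
    the same within-group quantile. repair_level=0 leaves the feature
    unchanged; repair_level=1 makes the group distributions (roughly) equal.
    """
    values = pd.Series(values, dtype=float)
    repaired = values.copy()
    for g in groups.unique():
        mask = groups == g
        # Within-group percentile rank of each observation (rank preserved)
        q = values[mask].rank(pct=True)
        # Value the pooled distribution takes at that percentile
        target = values.quantile(q).to_numpy()
        repaired[mask] = (1 - repair_level) * values[mask] + repair_level * target
    return repaired

# Hypothetical income data for a privileged (M) and unprivileged (F) group
df = pd.DataFrame({
    "gender": ["M"] * 5 + ["F"] * 5,
    "income": [40, 55, 60, 75, 90, 20, 30, 35, 45, 60],
})
df["income_repaired"] = repair_feature(df["income"], df["gender"], repair_level=1.0)
print(df.groupby("gender")[["income", "income_repaired"]].mean())
```

With repair_level=1.0 the two group distributions become almost identical; with repair_level=0.0 the feature is left untouched, mirroring the trade-off described above.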

An advantage of pre-processing methods is they can be used with any algorithm. This is because only the data is modified. A major downside is that the interpretation of our features is no longer clear. We may need to repair feature values for multiple protected variables (e.g. gender, race, country of origin). After so many shifts of a feature’s distribution, it loses its interpretation. Explaining such a model to a non-technical audience will be challenging.

In-processing

Instead of transforming data, we can adjust ML algorithms. These are known as in-processing methods. Typically, models are trained to maximise some measure of accuracy. In-processing methods work by adjusting the objective to also consider fairness. This can be done by changing the cost function to consider fairness or by imposing constraints on model predictions. Models are trained on biased data but the end result is a fair model.

For regression, one approach is to add a penalty term to the cost function. This works in a similar way to regularisation. With regularisation, we penalise large parameter values to reduce complexity and overfitting. Here, we instead introduce a penalty that reduces unfairness.

Specifically, the penalty is designed to push the model towards a mathematical definition of fairness. For example, see equalised odds in Figure 3. Here, the 0 subscript represents the unprivileged group and 1 the privileged group. Under this definition, we require the two groups to have equal true positive rates (TPR) and equal false positive rates (FPR).

Equalized odds definition of fairness
Figure 3: definition of equalized odds (source: author)
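As an illustration of the idea, here is a minimal numpy sketch of logistic regression trained with an added fairness penalty. The penalty is the squared gap between the groups' average predicted scores, computed separately among actual positives and actual negatives, as a differentiable stand-in for matching TPR and FPR. It is a sketch of the general approach, not a specific published algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def fair_logistic_regression(X, y, s, lam=1.0, lr=0.1, n_iter=2000):
    """Logistic regression with a soft equalised-odds penalty.

    X, y and s are numpy arrays: features, binary labels and the binary
    protected variable (1 = privileged, 0 = unprivileged). lam controls
    the accuracy/fairness trade-off.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ w)

        # Gradient of the usual log-loss
        grad = X.T @ (p - y) / n

        # Add the gradient of the fairness penalty, once among actual
        # negatives (FPR-like gap) and once among actual positives (TPR-like gap)
        for label in (0, 1):
            g1 = (y == label) & (s == 1)
            g0 = (y == label) & (s == 0)
            if g1.sum() == 0 or g0.sum() == 0:
                continue
            gap = p[g1].mean() - p[g0].mean()
            d_gap = (X[g1] * (p[g1] * (1 - p[g1]))[:, None]).mean(axis=0) \
                  - (X[g0] * (p[g0] * (1 - p[g0]))[:, None]).mean(axis=0)
            grad += lam * 2 * gap * d_gap  # gradient of lam * gap**2

        w -= lr * grad
    return w
```

Setting lam to 0 recovers ordinary logistic regression, while larger values trade accuracy for smaller gaps between the groups.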

Equalised odds is just one potential definition of fairness. Other definitions include equal opportunity and disparate impact. We discuss all of these in the article below. We also discuss the justification for each definition and show you how you can apply them using Python.

A downside to these approaches is that they can be difficult to implement in practice. They require adjusting well-established algorithms. Unlike pre-processing methods, these adjustments are also algorithm specific. The way we introduce a penalty term will be different for regression, tree-based methods or neural networks.

There are also issues with trying to define fairness mathematically. In doing so, we can miss the nuances of fairness. Matters are complicated further by the fact that definitions can conflict: if we achieve fairness based on one definition, we will not necessarily achieve fairness based on another. Trying to incorporate multiple definitions into the penalty will drastically increase the complexity of the algorithm.

Post-processing

Post-processing methods work by changing the predictions made by a model. We make no adjustments to the data or the algorithm. We just swap certain model predictions (e.g. from positive to negative). The goal is to do this only for predictions that are unfair.

One approach is to have different thresholds for the privileged and unprivileged groups. For example, suppose we use logistic regression to predict default on loan applications. We label a predicted default probability of less than 0.5 as a positive (1). That is, the customer is predicted not to default and is given a loan. Suppose that, using this threshold of 0.5, we find the TPR is significantly lower for females (0) than for males (1).

In other words, equal opportunity in Figure 4 is not satisfied. To fix this, we can raise the probability threshold for females to 0.6. That is, we use different thresholds for males (0.5) and females (0.6). The result is that more females are accepted for loans, leading to a higher TPR for that group. In practice, we adjust the thresholds until the difference in TPRs falls within some cutoff.

Equal opportunity definition of fairness
Figure 4: Definition of equal opportunity (source: author)
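A minimal sketch of group-specific thresholds is given below. The variable names are hypothetical, and in practice the thresholds would be tuned on a validation set until the TPR gap is acceptably small.

```python
import numpy as np

def true_positive_rate(y_true, y_pred):
    # Share of actual positives (non-defaulters) that are granted a loan
    return y_pred[y_true == 1].mean()

def group_thresholds(p_default, y_true, gender, thresholds):
    """Apply a different decision threshold per group.

    p_default : predicted probability of default (numpy array)
    gender    : 0 = female (unprivileged), 1 = male (privileged)
    thresholds: dict like {0: 0.6, 1: 0.5}; a loan is granted (prediction 1)
                when p_default is below the group's threshold.
    """
    cut = np.where(gender == 0, thresholds[0], thresholds[1])
    y_pred = (p_default < cut).astype(int)
    for g in (0, 1):
        tpr = true_positive_rate(y_true[gender == g], y_pred[gender == g])
        print(f"group {g}: TPR = {tpr:.2f}")
    return y_pred
```

For example, `group_thresholds(p_default, y_true, gender, {0: 0.6, 1: 0.5})` reproduces the thresholds discussed above.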

An issue with this approach is that it requires us to have information on protected variables at the time of prediction. To decide which threshold to use, we need reliable information about the applicant's sex. In many situations, due to legal or privacy reasons, this type of information is only available during training. We also face similar downsides to in-processing methods. That is, we will be trying to achieve fairness based on a single definition.

The shortcomings of quantitative approaches

Data scientists tend to focus on these quantitative approaches to fairness. This is where our strengths lie. However, it would be naive to assume that unfairness can be solved through adjusting data and algorithms alone. Fairness is a complicated issue and we need to see it as something that goes beyond our datasets. To use only quantitative approaches is like stitching up a bullet wound without removing the bullet first.

“Unjust and harmful outcomes, as a result, are treated as side effects that can be treated with technical solutions such as “debiasing” datasets rather than problems that have deep roots in the mathematization of ambiguous and contingent issues, historical inequalities, and asymmetrical power hierarchies or unexamined problematic assumptions that infiltrate data practices.”

Abeba Birhane

This may be the biggest downside of quantitative approaches. By satisfying some mathematical definition of fairness we convince ourselves that we have solved the problem. However, this can miss the nuances of fairness. It also does nothing to address the root cause of unfairness. This is not to say that quantitative approaches can’t be useful. They will be part of the solution. However, to fully address unfairness, we need additional non-quantitative approaches.

Non-quantitative approaches

In this section, we discuss some of these approaches. You can see an overview of these in Figure 5. Most of them require data scientists to step away from the computer and look at fairness with a wider lens. Successfully implementing these approaches can also require non-technical skills.

Overview of non-quantitative approaches to algorithm fairness
Figure 5: overview of non-quantitative approaches to algorithm fairness

Awareness of the problem

Data science involves technical work. We look at numbers, code and screens all day. It is easy to forget that the models we build impact real people. To make things worse, some do not even know that models can lead to unfair results. To address unfairness we first need to understand and accept the potential negative consequences of ML.

To start we need to understand the reasons for an unfair model. We’ve mentioned some of these. They include proxy variables, skewed datasets and historical injustice embedded in data. Unfairness can also come from our algorithm choices or how users interact with the models. We discuss these in more depth in the article below.

We then need to do a thorough fairness analysis. This is to understand the extent to which the above reasons are prevalent. It is also important to extend this analysis beyond data and models. A model may reject or accept loan applications. We can quantify these with 1s and 0s and calculate fairness metrics. However, this hides the true consequences of our models. Rejections could lead to a business going bankrupt or a student being unable to attend college.

Only once we have a full understanding of the risks of unfairness can we make appropriate decisions to mitigate those risks. Doing so may require a shift in how a data scientist sees their role. It is no longer purely technical but one that impacts real people.

Don’t use ML

We need to accept that ML is not the solution to all of our problems. Even if it can be used it may still lead to unfair results. There are also many unacceptable uses — e.g. predicting criminality using facial recognition. This means the best solution could be to not use ML at all. We would not automate a process. Instead, a human would be responsible for any decisions.

When choosing this route you need to consider all the costs. On one hand, it may be expensive for a human to make the decisions. On the other, a biased model can have significant negative consequences for users, which can lead to a loss of trust and reputational damage. These risks may outweigh any benefits of automating the process. The understanding of these risks will come from the fairness analysis described above.

Limit the use of ML

Chances are that, if you have put in the effort to build a model, you are going to want to use it. So, instead of completely discarding the model, you can limit how it is used. This can involve applying certain manual checks or interventions on decisions made by ML. Ultimately, we would use a model to only partially automate a process.

For example, rather than having a model automatically reject or accept every loan application, we could introduce a third outcome: referred. In this case, the application is checked manually by a human. The process of referring an application can be designed to reduce the impact of unfair outcomes and should be aimed at helping those who are most vulnerable.


Address the root cause

Unfairness in data is a reflection of reality. If we address the true underlying issues we could also solve the problem within our data. This will require a shift in company or government policy. This also means diverting resources away from quantitative approaches. It will require collaboration from the wider organisation or even the country as a whole.

For example, suppose an automated hiring process results in fewer women being hired. We can help address this problem by encouraging more women to apply. The company could achieve this through an ad campaign or by making its environment a better place for women. On a national level, it could be achieved through more investment in STEM education for women.

Spend time understanding models

Model interpretability is important for fairness. Interpretability involves understanding how a model makes predictions. This can be for the model as a whole (i.e. global interpretations), which is done by looking at trends across multiple predictions. This allows us to question those trends and decide whether they lead to unfairness.

It also means understanding how the model has made individual predictions (i.e. local interpretations). These tell us which model features have contributed the most to a specific prediction. This allows us to decide whether the model has led to an unfair outcome for a specific user.

Interpretability requires us to prioritise understanding over performance. We can do this using intrinsically interpretable models like regression or decision trees, which can be interpreted directly by looking at their parameters. For non-linear models, like xgboost or neural networks, we need model-agnostic approaches. A common approach is SHAP, which we introduce in the article below.
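As a rough illustration, the snippet below shows how the shap package might be used for both global and local interpretations of a tree-based model. It assumes a prepared feature matrix X (a pandas DataFrame, with protected variables kept out of the features) and a binary target y.

```python
import shap
import xgboost as xgb

# Assumes X (features) and y (target) are already defined.
model = xgb.XGBClassifier(n_estimators=200, max_depth=4).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global interpretation: which features drive predictions overall?
shap.summary_plot(shap_values, X)

# Local interpretation: why did the model make this prediction for applicant i?
i = 0
shap.force_plot(explainer.expected_value, shap_values[i], X.iloc[i],
                matplotlib=True)
```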

Give explanations

Model predictions can have serious consequences for users. Considering this, users have the right to an explanation of those predictions. Explanations can be largely based on local interpretations. That is, we can explain which features have contributed the most to the prediction that has impacted the user.

It is important to understand that interpretations and explanations are not the same thing. Interpretations are technical: we look at model parameters or SHAP values. Explanations are given to a non-technical audience, so it is important that they are given in a way that can be understood. We discuss how to do this in the article below.

Explanations can also go beyond feature contributions. We can explain the extent to which a decision-making process was automated. We could also explain what data was used in that process and where we got it from. These aspects of explanations can be defined by law or customer demands.

Give opportunity to challenge decisions

Once a user is given an explanation, they can decide whether it is reasonable or not. If they decide it is unreasonable they must be allowed to challenge the decision. Giving users this power over decisions is essential for combating unfairness. It means unfair decisions are far more likely to be challenged and corrected.

This goes back to the point of limiting the use of ML. We need to put procedures in place that allow at least some of the decisions to be made manually. As with our loan model example, decisions can be referred to the lender instead of being automatically declined. It is important that applicants have control over this process.

We can also think back to our quantitative approaches. Approaches like DIR can increase complexity and negatively impact interpretability. This can make it more difficult to give human-friendly explanations. In other words, by affecting explanations, some quantitative approaches can have a negative impact on fairness.

Team diversity

Data is the best tool we have for making effective decisions. However, as mentioned, it can hide the true consequences of ML. Our lived experiences can be a better indication. These experiences are shaped by our cultural backgrounds. That is, how we experience the world and technology depends on our gender, race, religion or country of origin. This makes it important to get feedback on our ML system from a diverse group of people.

It is also important to hire a diverse team. These are the people who will actually be building the model and system. Doing so will bring a diverse set of lived experiences to the table. They will all have an understanding of how the system will impact their own lives. This will make it easier to identify potential fairness issues before a model is deployed.


Diversity also means hiring people with different areas of expertise. As we have mentioned, to address unfairness we need to go beyond quantitative approaches. In other words, we need skills that most data scientists do not have. A key team member would be an expert in AI ethics. They would have a deeper understanding of the non-quantitative approaches we’ve outlined above. Data scientists will then be able to focus on the quantitative approaches.

I hope you found this article helpful! You can support me by becoming one of my referred members. You’ll get access to all the articles on Medium and I’ll get part of your fee.

You can find me on | Twitter | YouTube | Newsletter — sign up for FREE access to a Python SHAP course

Image Sources

All images are my own or obtained from www.flaticon.com. In the case of the latter, I have a “Full license” as defined under their Premium Plan.

References

Birhane, A., (2021) Algorithmic injustice: a relational ethics approach. https://www.sciencedirect.com/science/article/pii/S2666389921000155

Pessach, D. and Shmueli, E., (2020), Algorithmic fairness. https://arxiv.org/abs/2001.09784

Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. and Galstyan, A., (2021), A survey on bias and fairness in machine learning. https://arxiv.org/abs/1908.09635

Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C. and Venkatasubramanian, S., (2015), Certifying and removing disparate impact. https://dl.acm.org/doi/pdf/10.1145/2783258.2783311

Bechavod, Y. and Ligett, K., (2017), Penalizing unfairness in binary classification. https://arxiv.org/abs/1707.00044

Lo Piano, S., (2020). Ethical principles in machine learning and artificial intelligence: cases from the field and possible ways forward. https://www.nature.com/articles/s41599-020-0501-9

Smith, G., (2020). What does “fairness” mean for machine learning systems? https://haas.berkeley.edu/wp-content/uploads/What-is-fairness_-EGAL2.pdf

Google, (2022), How Inclusive Data Builds Stronger Brands https://www.thinkwithgoogle.com/feature/ml-fairness-for-marketers/#what-we-learned
