
An “Unbiased” Guide to Bias in AI

Ethical vs. statistical bias in AI/ML models

A conceptual overview

Whenever there is any mention of ethics in the context of AI, the topic of bias and fairness often follows.

Similarly, whenever there is any mention of training and testing machine learning models, the trade-off between bias & variance features heavily.

But do these two mentions of bias refer to the same thing? Well, to some extent, but not quite…

Some basic definitions…

Before I go on to explain, I suspect most readers of this blog already have at least a basic understanding of machine learning and related concepts, so I will only go through a few key definitions, mainly for reference:

Machine learning: The process whereby computers (i.e. machines) use algorithms to "learn" patterns from data, without the need for humans to explicitly define which specific patterns to learn – Author

In order for machines to learn these patterns, especially in "supervised learning", they go through a training process whereby an algorithm extracts patterns from a training dataset, typically in an iterative manner. The model’s predictions are then tested on an unseen (out-of-sample) test dataset to check whether the patterns learnt from the training dataset generalise.
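To make this concrete, here is a minimal sketch of that train/test workflow using scikit-learn on a synthetic dataset (purely illustrative, not tied to any particular use case):

```python
# A minimal sketch of the supervised train/test workflow described above,
# using scikit-learn on a synthetic dataset (illustrative only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic input features X and target y
X, y = make_classification(n_samples=1_000, n_features=10, random_state=42)

# Hold out an unseen (out-of-sample) test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# "Learn" patterns from the training data
model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)

# Check whether those patterns hold on data the model has never seen
print("Train accuracy:", accuracy_score(y_train, model.predict(X_train)))
print("Test accuracy: ", accuracy_score(y_test, model.predict(X_test)))
```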

What’s bias in layman’s terms?

The textbook definition of bias per the Cambridge Dictionary is the following:

Bias: The action of supporting or opposing a particular person or thing in an unfair way, because of allowing personal opinions to influence your judgment.

A relatable example is that of sports fans who support a particular team and, due to their bias, always predict that their team will win every match, even if in reality their team’s win rate may be less than ~50%, which clearly highlights their delusion (a.k.a. bias)!

What is statistical bias?

In training machine learning models, there is a trade-off between bias and variance: bias reflects how poorly a model has captured the underlying patterns in the data (its systematic, average error), while variance reflects how much the model’s performance changes across different cuts of the data (i.e. training vs. test vs. validation).

  • Underfitting: A model with high bias is said to underfit the training dataset, i.e. it has not learned adequate patterns that capture relevant relationships between the input and the output/target. In such cases, the variance of how the model performs on training versus test/validation datasets tends to be low i.e. performs equally badly across all datasets.
  • Overfitting: A model with low bias is said to overfit the training dataset, i.e. it has learned too many granular patterns in the training dataset, which enables it to perform extremely well when tested against the training dataset, but very poorly when tested against the test/validation datasets, thereby having high variance.

Both underfitting and overfitting result in poor performance when deployed to production and exhibit what is referred to here as statistical bias. Statistical bias can be summarised as the average error in a model’s predictions versus reality (i.e. the correct output that the model is trying to predict).
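To see both failure modes in action, here is a hedged sketch on synthetic data: a very simple (low-degree) polynomial underfits, while an overly flexible (high-degree) one fits the training data far more closely than it generalises to the test set.

```python
# A hedged sketch of underfitting vs. overfitting on synthetic data:
# a low-degree polynomial underfits (high bias), while a very high-degree
# one fits the training data closely but generalises poorly (high variance).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=60)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    # "Statistical bias" shows up here as the average error of the predictions
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:>2}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```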

What are the root causes of statistical bias?

The main root cause of statistical bias is often that the patterns the model has learnt through the training process are not reflective of the real relationships between the input data and the target/output. As a result, the model either needs more optimisation and training, or additional/better data and/or features from which to learn more relevant patterns.

Now imagine we are training an ML model to predict the likelihood of a football team winning a match. To do so, we have access to 20 years of historical head-to-head results and are using them to predict the outcome of a given match. This model is likely to be highly biased as we all know that past results are not always a good indicator of future performance (especially in the case of Manchester United…!).

In this case, additional predictive features could be included to reduce the bias of the model, such as: fitness of players, availability of star player(s), the formation in use, relative experience of players and coach, in-game statistics such as the amount of possession, number of passes, number of yellow/red cards, recent results, amongst others.
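As a purely hypothetical sketch of what that enriched feature set might look like (the column names and toy data below are invented for illustration, not a real dataset):

```python
# A hypothetical sketch of the enriched feature set for the match-prediction
# model (column names and data are invented for illustration only).
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

matches = pd.DataFrame({
    "head_to_head_win_rate": [0.6, 0.4, 0.7, 0.3, 0.5, 0.8],  # the original, high-bias feature
    "avg_player_fitness":    [0.9, 0.7, 0.8, 0.6, 0.9, 0.7],
    "star_player_available": [1, 0, 1, 1, 0, 1],
    "recent_form_points":    [12, 7, 10, 5, 9, 13],
    "possession_pct":        [58, 44, 61, 39, 51, 64],
    "home_win":              [1, 0, 1, 0, 1, 1],               # target
})

X = matches.drop(columns="home_win")
y = matches["home_win"]

# With richer features the model has more relevant patterns to learn from,
# which should reduce its statistical bias (average prediction error).
model = GradientBoostingClassifier(random_state=0).fit(X, y)
print(model.predict(X))
```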

So what about ethical bias?

Statistical bias and ethical bias can be considered independent of each other. You can have a model that is near perfect in terms of statistical bias (i.e. has a low average error) but exhibits hugely concerning ethical bias. Ethical bias can be defined as bias that leads to unethical (e.g. illegal, unfair or immoral) outcomes, often disadvantaging a particular group of individuals or infringing on their rights.

Statistical Bias vs Ethical Bias – cheat sheet

I am a big fan of the so-called "one-pagers" and have created the following cheat sheet that captures the essence of this blog in a condensed manner. It can be a useful tool to assess potential root causes of both statistical and ethical bias in your AI/ML models, along with a non-exhaustive list of possible mitigations to address them:

The first half of the above table covers different types of "statistical bias" in AI/ML models, most of which are relatively well-established in the data science community. Underfitting and overfitting can be thought of as the main symptoms that help detect statistical bias in AI/ML models, and the other 7 items can be thought of as the root causes leading to such statistical bias. The 7 root causes are:

  1. Having an unrepresentative sample used for training
  2. Having unbalanced classes
  3. Lack of sufficient data and/or predictive features
  4. Ineffective algorithms and/or hyper-parameters
  5. Reinforced bias through ineffective dynamic retraining of models and/or "biased" reward functions in reinforcement learning agents
  6. Inconsistent labeling or mislabeling of data used for training
  7. Inconsistent/different quality of data used for training versus data used in production (i.e. measurement bias)

The second half of the table covers potential root causes of "ethical bias" in AI/ML models and possible mitigations, which is a less-established topic in the data science community and is still evolving. I’ll therefore dive a bit deeper into each of the potential root causes of ethical bias in the next section with the help of "real" examples/use-cases.

Ethical and statistical bias explained through real examples

Let’s consider an example that is a favourite in the financial services industry: that of a bank developing an ML model to predict the credit worthiness of individuals in order to help it decide whether or not to grant a loan.

Let’s assume that this model is fed with historical data related to household income of individuals as input, and the target to predict is whether they were able to successfully pay back their loan in full.

In this situation, it is highly plausible that the model learns a pattern whereby the higher the income of an individual, the higher the likelihood of a full payback. Whilst this pattern may be correct in many cases, it will likely have high statistical bias (i.e. high average error in its predictions) as it fails to consider other important factors such as cost of living, number of dependents, industry and type of job, amongst others.
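A hedged sketch of this effect, using synthetic data with entirely invented relationships: a model that only sees household income ends up with a noticeably higher error than one that also sees cost of living and number of dependents.

```python
# A hedged sketch (synthetic data, invented relationships) showing how a model
# fed only household income can carry higher statistical bias than one that
# also sees other relevant factors such as cost of living and dependents.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(1)
n = 5_000
income = rng.normal(50_000, 15_000, n)
cost_of_living = rng.normal(30_000, 10_000, n)
dependents = rng.poisson(1.5, n)

# Assume (purely for illustration) that repayment depends on disposable income
disposable = income - cost_of_living - 5_000 * dependents
repaid = (disposable + rng.normal(0, 10_000, n) > 0).astype(int)

X_full = np.column_stack([income, cost_of_living, dependents])
X_income_only = income.reshape(-1, 1)

for name, X in [("income only", X_income_only), ("richer features", X_full)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, repaid, random_state=1)
    model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    print(f"{name:>16}: test accuracy = {acc:.2f}")
```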

Ethical bias through inclusion of protected characteristics

Let’s continue with the same credit-worthiness prediction scenario. After realising this statistical bias, the bank feeds additional data to the model, including some of the factors highlighted above (e.g. cost of living, etc) as well as some personal data such as gender and ethnicity. This results in the model producing near perfect predictions with low statistical bias. However, the same model could now exhibit high ethical bias due to the inclusion of personal data.

Statistically speaking, it may be "true" that, for example, certain ethnicities have historically been more successful on average in paying back their loans compared to others. However, as true as that may be, in most countries it is 1) illegal (based on equality acts such as the EU’s Equal Treatment in Goods and Services Directive) and 2) immoral to take a person’s ethnicity into consideration when making a decision about their credit worthiness.

This is a good example of how a model can have low statistical bias but exhibit high ethical bias at the same time.

Ethical bias by proxy

Even if features such as gender or ethnicity are not explicitly included in the input data, it is possible that the model somehow learns them via what are referred to as proxy features. Proxy features are characteristics that are correlated with the personal characteristics in question, such as certain jobs in which men are more heavily represented (e.g. builders, pilots, etc.), or certain postcodes that are more popular with specific ethnic groups.
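One simple (and admittedly crude) way to probe for proxies is to check how well the model’s input features can predict the protected attribute itself. The sketch below uses invented data and column names purely for illustration:

```python
# A hedged sketch of a simple proxy check: even after dropping a protected
# attribute from the model's inputs, we can test how well the remaining
# features predict it. High predictability suggests proxy features are present.
# (Data, column names and thresholds are assumptions for illustration.)
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

applicants = pd.DataFrame({
    "job_category":  [0, 1, 1, 0, 2, 2, 1, 0],
    "postcode_area": [3, 3, 1, 2, 1, 1, 3, 2],
    "income":        [40, 55, 60, 35, 70, 65, 50, 38],
    "gender":        [0, 1, 1, 0, 1, 1, 1, 0],  # protected attribute (held out of the model)
})

X = applicants.drop(columns="gender")
protected = applicants["gender"]

# If the model's input features can predict the protected attribute well
# above chance, they are likely acting as proxies for it.
score = cross_val_score(RandomForestClassifier(random_state=0), X, protected, cv=2).mean()
print(f"Features predict the protected attribute with accuracy ~{score:.2f}")
```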

Ethical bias through historically biased decisions

There is another way ethical bias can creep into machine learning models: when historically biased decisions form part of the training data against which the model tries to optimise itself.

In the same credit worthiness scenario, even if you remove all personal features like gender and ethnicity from the input data, if the model’s target is based on previous decisions that human loan officers made, another form of ethical bias could remain due to any gender or ethnic bias that certain officers may have exhibited. If these biased decisions were somewhat systematic (e.g. occurred more than a handful of times), ML models can easily pick up on them and learn the same biased patterns.

Ethical bias through an inappropriate choice of target

If the target that a model is trying to optimise itself against is not selected appropriately, it can lead to ethical bias. For example, in a stock portfolio selection model, setting a target that purely seeks to increase profit, without any regard for the companies’ environmental or sustainability impacts or any other ethical/legal factors, can result in an unethical portfolio selection.

Similarly, in the credit-worthiness example, having a target that purely seeks to maximise profit may lead to increasing the interest rate for certain underserved communities, thereby putting them under yet more hardship and reinforcing any societal biases that had led to their deprived situation to start with.

Ethical bias in language models

Language models are typically trained on large text corpora and are therefore susceptible to learning any inappropriate language or unethical viewpoints that the text might contain. For example, the inclusion of racist language or profanities in unstructured text can lead to the model learning such patterns, causing problems when it is applied in a chatbot, text generation, or other similar contexts.

One way to mitigate this risk is to remove any unethical portions from the corpus so that the model does not learn questionable patterns/language; one could also remove profanities to make the training data "cleaner". On the other hand, completely excluding such content can also negatively impact the model: there may be a downstream need to identify and remove hateful speech or profanities, which would not be possible if the model never comes across this kind of data during its training.
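As a minimal, purely illustrative sketch of the first mitigation, corpus filtering might look like this (the blocklist and documents are placeholders; real pipelines rely on curated lists and classifiers rather than a simple token match):

```python
# A minimal sketch of filtering a training corpus with a blocklist before
# language-model training (blocklist terms and documents are placeholders).
BLOCKLIST = {"badword1", "badword2"}  # placeholder terms, not a real list

corpus = [
    "a perfectly ordinary sentence",
    "a sentence containing badword1",
]

def is_clean(document: str) -> bool:
    # Keep a document only if none of its tokens appear in the blocklist
    tokens = set(document.lower().split())
    return tokens.isdisjoint(BLOCKLIST)

filtered_corpus = [doc for doc in corpus if is_clean(doc)]
print(filtered_corpus)
```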

How can we assess/test if a model exhibits ethical bias?

A good way to assess whether a model is exhibiting signs of ethical bias is to perform predictive parity tests. In simple terms, predictive parity checks whether the distribution of predictions is equivalent across the subgroups in question (e.g. gender, ethnicity, etc.).

There are different types of predictive parity tests, for example:

  • Bias preserving: e.g. in a CV screening model, even though there may be more men than women in our data distribution, the rate of accepted CVs should be similar across both genders.
  • Bias transforming: e.g. irrespective of the skewed distribution of men vs. women, the aim is to achieve an equal number of acceptances across both genders.
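A very simple, illustrative parity check is to compare the model’s acceptance rate across subgroups (the data and threshold below are invented, and real parity testing is more nuanced than this):

```python
# A hedged sketch of a simple group-parity check: compare the model's
# acceptance rate across subgroups (demographic-parity style).
import pandas as pd

results = pd.DataFrame({
    "gender":    ["M", "M", "F", "F", "M", "F", "M", "F"],
    "predicted": [1, 0, 1, 0, 1, 0, 1, 1],  # model's accept/reject decisions
})

# Acceptance rate per subgroup
rates = results.groupby("gender")["predicted"].mean()
print(rates)

# A crude flag: a large gap in acceptance rates warrants investigation
gap = rates.max() - rates.min()
if gap > 0.1:  # illustrative threshold, not a regulatory standard
    print(f"Potential parity concern: acceptance-rate gap of {gap:.2f}")
```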

For example, to test the credit worthiness model for ethical bias, one could include the gender or ethnicity of individuals after the model is trained, in order to check whether its predictions are skewed towards one of the genders or ethnicities. If anomalies are present, then explainability techniques such as Shapley (SHAP) values can help identify which feature(s) are acting as potential proxies for protected characteristics, so that appropriate action can be taken.
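For instance, a hedged sketch of such an explainability check using the shap library might look like this (the model, data, and feature names are invented placeholders):

```python
# A hedged sketch of using Shapley-value based explanations (via the shap
# library) to inspect which features drive a model's predictions, e.g. to
# spot a feature acting as a proxy for a protected characteristic.
# The model, data, and feature names below are invented placeholders.
import numpy as np
import shap  # pip install shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.RandomState(0)
feature_names = ["income", "cost_of_living", "postcode_area", "job_category"]
X = rng.normal(size=(500, 4))                  # stand-in feature matrix
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # stand-in accept/reject labels

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# TreeExplainer computes Shapley values for tree-based models
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# The summary plot ranks features by overall contribution; a surprisingly
# influential feature (e.g. postcode_area) may be acting as a proxy.
shap.summary_plot(shap_values, X, feature_names=feature_names)
```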

It should, however, be noted that sometimes there is not much one can do other than accept the bias. For example, there may be a pattern whereby people in a certain job family are assessed as more creditworthy than others; due to the higher proportion of women in that field, the parity test then highlights a positive bias towards females. Such situations may sometimes need to be accepted, but they are still very important to understand and make note of.

The details of these bias tests go beyond the scope of this blog, but one of the best sources on the topic is the work of Professor Sandra Wachter from the University of Oxford (see "Further reading" at the end for related links).

Why is ethical bias considered a bigger "risk" in machine learning models?

One could argue that rule-based models are equally susceptible to ethical bias, as they can also include rules pertaining to individuals’ race, gender, and other protected characteristics. However, there are two main differences in machine learning: 1) sometimes those features are included by mistake or exist only by proxy, and 2) humans have no direct input into the specific patterns a model learns (other than choosing the training data, engineering features, selecting algorithms, tuning hyper-parameters, etc.). As a result, the model could unintentionally learn biased patterns that remain hidden until parity or other similar tests are performed to uncover them.

On the other hand, in the case of rule-based systems, all those rules need to be explicitly defined by someone, and unless there is specific malice or ignorance involved on the part of the developer, it is harder for those unethical patterns to "creep in", so to speak.

In conclusion…

Bias is a hugely important and growing area of AI that needs to be front of mind for every data science professional. It is still a relatively new and evolving topic, and it’s important for the industry to align on a common set of definitions and terminology from the outset.

As proposed in this blog, statistical and ethical bias are two different categories of bias with distinct root causes and mitigations (see Table 1 for a summary).

Most seasoned data scientists will already have a good grasp of managing statistical bias, as it relates to the well-established trade-off between bias and variance in ML. However, more awareness is required when it comes to managing ethical bias in AI/ML applications, especially given its potential to unwittingly infringe on basic human rights such as equality and privacy.

Further reading…

Here are a few suggestions for those who are interested in reading more on this topic:


What are some examples of bias in AI/ML models that you have come across? I look forward to your comments!

