
Imagine you spent countless hours working on a job application, only to find out that it was rejected, not because you lacked the skills but because you are female.
Yes, this is unfair.
This actually happened at Amazon when the company tested an automated system that used AI to select the best candidates from a pool of applicants. The AI was trained on resumes collected over a period of 10 years, and most of the resumes in the training data belonged to men. This led the AI to favour men in the selection process. It also reflects the sad reality that the tech industry is male-dominated. You can read more about it here.
With the growth in AI development, many complicated and time-consuming tasks are being automated with predictive models. Every year, a new set of complex AI architectures is built, and the explainability of their decision-making process keeps decreasing. This adds to the reasons why AI systems end up biased or "unfair".
Sensitive Features

AI/ML models trained on data containing sensitive features such as gender, race, or ethnicity will not turn out to be fair models. They capture the disparities present in the data and reproduce them in their predictions. In other words, by using sensitive features in the training process we are adding bias (gender bias, racial bias, etc.) to the model.
There are also cases where you do not train the model on any explicit sensitive feature and the model still ends up biased. This can be caused by other factors such as data imbalance. The problem is difficult to fix when the explainability of the model is low, because we need to understand how the data is being used to make the final prediction.
Bias

Bias, in simple words, is an underlying assumption we use to make a decision. Humans are inherently biased, and this bias enables us to make faster decisions. Just as an AI learns from data, we learn from our previous experiences and form mental constructs to arrive at a decision. Some of these biases actually help us make good decisions, and sometimes not so much.
For an AI model, we want to eliminate unwanted biases and make the model fair in making a judgement or decision.
Some more examples of these biased AIs can be seen in the following articles:
- Study unveils Facebook gender bias in job ads, where Facebook’s AI showed certain ads to a specific gender, such as showing a Domino’s delivery job mostly to a male audience.
- Millions of black people affected by racial bias in health-care algorithms: an algorithm developed to assign risk scores to patients tended to assign a lower risk score to a black person than to an equally sick white person.
- A Popular Algorithm Is No Better at Predicting Crimes Than Random People: COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) assigns each person a score indicating their risk of committing a crime.
- Complaints about bias in Twitter image cropping algorithm: users questioned whether Twitter’s algorithm showed gender and racial bias when cropping images.
As you can see, there are multiple examples where biased AIs have worsened societal inequalities. This is concerning from both a moral and an ethical standpoint.
Source of Bias

From our human perspective, we say that the AI algorithms are flawed and unable to address the problem of biased predictions.
Conversely, an AI would argue that the model it built reflects actual reality: the AI learns from the data we provide it, and if bias is inherently present in the data, then the predictive model will also be biased.
So, one source of bias in predictions is the data itself. For instance, consider Amazon’s automated resume selector. If the input data consists of many more resumes from male candidates, the final predictions/decisions will be tilted towards male candidates (since the algorithm was trained on more data from male candidates than from female candidates).
Perhaps we can correct these biased predictions by providing the AI model with data containing an equal number of resumes from both genders. In the following sections, I will discuss some techniques to debias a biased prediction model.
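To make that idea concrete, here is a minimal sketch of one way to balance such a dataset before training, by downsampling the over-represented gender. The file name and column names (resumes.csv, gender) are purely illustrative assumptions, not Amazon's actual pipeline.

```python
import pandas as pd

# Hypothetical resume dataset with an explicit 'gender' column (illustrative only).
df = pd.read_csv("resumes.csv")

# Downsample every gender group to the size of the smallest group,
# so the model sees an equal number of examples per gender.
min_count = df["gender"].value_counts().min()
balanced = (
    df.groupby("gender", group_keys=False)
      .apply(lambda g: g.sample(n=min_count, random_state=42))
)

print(balanced["gender"].value_counts())
```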
Example Dataset
I was involved in a Kaggle competition on an NLP multi-class classification task, where we were given descriptions of various jobs and our objective was to match them with a list of 28 different job titles such as Dentist, Teacher, Lawyer, etc.
Note: The data in this Kaggle competition was collected from CommonCrawl, which is free to use.
Here, every job description has an explicit gender feature assigned to it. There is also implicit gender information in the text of the job descriptions, such as pronouns (he/she) and gendered words (mother, father, son, daughter, etc.).

From the screenshot of the dataset, we can see the gender pronouns in the job descriptions as well as the explicit gender feature.
Let’s check the gender disparity between various jobs in the dataset.
Top 10 jobs with more female examples:

I took the ratio (male:female) of examples present in the dataset for each job title. From the image, you can see that jobs like dietician, nurse and teacher are dominated by women.
Jobs like poet and journalist have ratios close to 1, where we see almost equal proportions of both genders.
Top 10 jobs with more male examples:

If we visit the other end of the spectrum, we see that jobs like rapper, surgeon and DJ are mostly male-dominated.
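As a rough sketch, the male:female ratio per job title can be computed with pandas along these lines; the file name, column names (job_title, gender) and label values are assumptions about the competition data.

```python
import pandas as pd

# Assumed columns: 'job_title' and 'gender' with values 'male'/'female'.
df = pd.read_csv("train.csv")

counts = df.groupby(["job_title", "gender"]).size().unstack(fill_value=0)

# male:female ratio per job title: values well above 1 are male-dominated,
# values well below 1 are female-dominated, and ~1 means roughly balanced.
counts["male_female_ratio"] = counts["male"] / counts["female"]

ratios = counts["male_female_ratio"].sort_values()
print(ratios.head(10))   # jobs with more female examples
print(ratios.tail(10))   # jobs with more male examples
```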
This is the disparity that we are going to feed as input to the AI model. The model will train on these disparities and gender-sensitive features and become a biased predictive model.
Although we might end up with an accurate model that performs well on the test set, it would not be a fair model.
In this case, the bias can creep into the model from:
- Gender feature
- Implicit gender information in the description
- Highly imbalanced job classes such as dietician and rapper.
Any Solutions?

There are some methods to tackle this problem, but we cannot eliminate the problem completely.
The methods that we (my team) investigated during the competition:
- Remove the gender feature and do not consider it for the training process.
- Remove all gender-specific words, such as pronouns, nouns and adjectives that have any connection to gender.
- Debias the word embeddings before training the model.
- Back translation of the text (translating it into another language and back again).
All in all, our objective was to eliminate the gender bias right at the source by preprocessing the text and, at a later stage, by debiasing the word embeddings before training the model. A sketch of the preprocessing step is shown below.
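Here is a hedged sketch of the first two steps: dropping the explicit gender column and stripping a small list of gendered words from the job descriptions. The word list, file name and column names are illustrative assumptions, not the exact ones we used in the competition.

```python
import re
import pandas as pd

# Small illustrative list of gendered words; a real list would be much longer.
GENDERED_WORDS = {
    "he", "she", "him", "her", "his", "hers",
    "mr", "mrs", "ms", "mother", "father", "son", "daughter",
}

def strip_gendered_words(text: str) -> str:
    """Remove gendered tokens from a job description."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return " ".join(t for t in tokens if t not in GENDERED_WORDS)

df = pd.read_csv("train.csv")                    # assumed file and columns
df = df.drop(columns=["gender"])                 # 1. drop the explicit gender feature
df["description"] = df["description"].apply(strip_gendered_words)  # 2. strip gendered words
```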
Some python libraries which we used for debiasing:
- Debiaswe: try to make word embeddings less sexist
- Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
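The core idea behind these libraries, hard debiasing, can be illustrated in a few lines of NumPy: estimate a gender direction from definitional word pairs and project it out of the vectors of gender-neutral words. This is a simplified sketch of the technique with toy vectors, not the libraries' actual API.

```python
import numpy as np

def gender_direction(emb, pairs):
    """Estimate a gender direction from definitional pairs like ('he', 'she')."""
    diffs = [emb[a] - emb[b] for a, b in pairs]
    d = np.mean(diffs, axis=0)
    return d / np.linalg.norm(d)

def neutralize(vec, direction):
    """Project out the component of a word vector that lies along the gender direction."""
    return vec - np.dot(vec, direction) * direction

# Toy 3-dimensional embeddings, purely for illustration.
emb = {
    "he":    np.array([ 1.0, 0.2, 0.0]),
    "she":   np.array([-1.0, 0.2, 0.0]),
    "nurse": np.array([-0.6, 0.5, 0.3]),
}
d = gender_direction(emb, [("he", "she")])
emb["nurse"] = neutralize(emb["nurse"], d)   # 'nurse' no longer leans towards either gender
```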
Were we able to eliminate the bias completely?
Unfortunately, no. Not completely, but we managed to reduce the bias to some extent.
Fairness Metric
For this specific dataset, we used macro disparate impact as a metric to measure the fairness of the model. For each job, we take the ratio of the maximum gender count to the minimum gender count.

For instance, to calculate the disparate impact for the accountant job, we simply divide the maximum gender count (1992) by the minimum gender count (1129). The same is repeated for all jobs.
To calculate the macro disparate impact, we then take the average of all the individual disparate impact scores, as in the sketch below.
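A short sketch of how this metric can be computed, assuming a dataframe of per-example predictions with job_title and gender columns (the file and column names are assumptions):

```python
import pandas as pd

# Assumed columns: 'job_title' and 'gender'.
preds = pd.read_csv("predictions.csv")

counts = preds.groupby(["job_title", "gender"]).size().unstack(fill_value=0)

# Disparate impact per job: max gender count / min gender count
# (1.0 means the genders are perfectly balanced for that job).
di = counts.max(axis=1) / counts.min(axis=1)

# Macro disparate impact: the average over all jobs.
macro_di = di.mean()
print(di.sort_values(ascending=False).head())
print(f"Macro disparate impact: {macro_di:.3f}")
```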
Acknowledgement: My team finished in 8th position on the Kaggle private leaderboard, out of the 78 teams that participated in this competition.
Conclusion
Automating complex tasks using Artificial Intelligence is impressive, but we should first address the issues that stem from it, i.e., biased predictive models. Fairness in AI has gained a lot of traction in recent times, and more research is being done to make models as fair as possible. Explainable AI (XAI) is also an important topic, as it can point out the source of bias in a model.
Some tools and research efforts that aim to improve fairness in AI:
- AI Fairness 360: A Python library developed by IBM to mitigate bias and increase model fairness.
- Fairness Flow: A tool used by Facebook for the same purpose of reducing bias in AI.
- ML-Fairness Gym: A toolkit built by Google to study the fairness aspect in AI.
Finally, Fairness in AI has garnered the attention of the big players in the industry, and more research will be put into this topic. It is definitely worth your time to check it out if you have not already.
If you have reached this part of the post, thank you for reading and for your attention. I hope you found it informative. You can reach me on LinkedIn, Twitter or GitHub.