
Discriminatory AI explained with an example

Discriminatory AI demonstrated with examples and visuals

Photo on right by Divya Agrawal on Unsplash and image on left is by author

AI is increasingly used to make decisions that affect us directly, such as job applications, credit ratings, and match-making on dating sites. So it is important that AI is non-discriminatory and that its decisions do not favor a particular race, gender, or skin color.

Discriminatory AI is a very wide subject that goes beyond purely technical aspects. However, to make it easy to understand, I will demonstrate what discriminatory AI looks like using examples and visuals. This will give you a way to spot a discriminatory AI.

Context

Let me first establish the context of the example. The business context is the credit loan process in banking. Let us assume that the bank has built an AI that looks at loan application data and decides whether to approve or reject it.

Pictorial representation of the business context (image by author)

The loan application data contains information such as income, gender, profession, number of children, and whether the person owns assets such as a house or a car.

First look at discriminatory AI

The data on which the AI has been trained also contains historical information on whether previous loan applications were accepted or rejected.

Snapshot of Training Data (image by author)

Let us assume that the AI has been trained using a decision tree algorithm, which is shown below. Each node in the decision tree represents a data column and a threshold value.

Loan approval AI model based on decision tree (image by author)

The output of this decision tree is YES or NO, indicating whether the credit loan was accepted or rejected. When you see nodes such as GENDER_FEMALE or GENDER_MALE, that is already an indication that the AI model differentiates between male and female when making the loan approval decision. This is a possible sign that the AI is discriminatory.
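To make this concrete, here is a minimal sketch of how such a model could be trained with scikit-learn. The file name and column names (income, gender, profession, children, owns_house, owns_car, approved) are assumptions for illustration, not the author's actual dataset.

```python
# Minimal sketch (not the author's exact code): train a loan-approval decision tree
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical historical loan data with the columns described above
df = pd.read_csv("loan_applications.csv")

# One-hot encode categorical columns so that nodes such as GENDER_FEMALE
# can appear as features in the tree
X = pd.get_dummies(df.drop(columns=["approved"]), columns=["gender", "profession"])
y = df["approved"]  # YES / NO

model = DecisionTreeClassifier(max_depth=5, random_state=42)
model.fit(X, y)

# Print the learned rules; gendered splits show up as e.g. "gender_female <= 0.5"
print(export_text(model, feature_names=list(X.columns)))
```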

Confirming the discriminatory nature of AI

Now let us confirm whether the AI is discriminatory by taking some examples. Let us take two applications: one from a female applicant who is a nurse by profession, and another from a male applicant who is a sales manager.

Examples to confirm discriminatory behavior of AI. Only sample fields are shown (image by author)

The salary levels are more or less equal, with the female applicant having a slightly higher salary than the male applicant. Both applicants also own a car and a house.

Let us now make loan approval predictions for both applicants. The decision path for the female applicant is shown below. The decision is NOT to approve the loan. You can see that one of the decision nodes in the path is that the gender is female.

Decision path for the female candidate (image by author)

Now let us predict for the male applicant. The decision path is shown below. Surprise, surprise: the AI's decision is to approve and give a big yes.

Decision path for the male candidate (image by author)
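For readers who want to reproduce this comparison, here is a sketch that builds on the training snippet above. The two applicant records are illustrative values that mirror the figures, not the exact data used by the author.

```python
# Score the two example applicants and trace their decision paths
# (builds on the df / X / model objects from the training sketch above)
import pandas as pd

applicants = pd.DataFrame([
    {"income": 62000, "children": 1, "owns_house": 1, "owns_car": 1,
     "gender": "female", "profession": "nurse"},
    {"income": 60000, "children": 1, "owns_house": 1, "owns_car": 1,
     "gender": "male", "profession": "sales_manager"},
])

# Encode the same way as the training data and align the columns
X_new = pd.get_dummies(applicants, columns=["gender", "profession"])
X_new = X_new.reindex(columns=X.columns, fill_value=0)

print(model.predict(X_new))  # e.g. ['NO' 'YES'] -- rejected vs. approved

# decision_path returns the tree nodes each applicant passes through,
# which is what the decision-path figures above visualize
node_indicator = model.decision_path(X_new)
for i in range(len(X_new)):
    nodes = node_indicator.indices[node_indicator.indptr[i]:node_indicator.indptr[i + 1]]
    print(f"Applicant {i} passes through nodes {list(nodes)}")
```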

Even though both applicants have similar income levels and own similar assets, the female applicant got rejected and the male applicant got approved.

This is a clear sign that this is a discriminatory AI.

Understanding why this AI is discriminatory

Let us dive deeper to understand why the AI took such discriminatory decisions, using SHAP (SHapley Additive exPlanations) analysis. Here is the SHAP analysis for the female applicant.

SHAP Analysis for female applicant

The blue bars indicate factors that contribute positively towards credit approval, and the red bars indicate factors that contribute negatively. Income and owning a house contribute positively, but being female has a negative impact.

Here is the SHAP analysis of the male applicant.

SHAP Analysis for male applicant

For the male applicant, income and housing contribute positively, and being male also contributes positively. So clearly, being female is not helping the female applicant.
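Plots like these can be produced with the shap package. Here is a rough sketch, again building on the earlier snippets; note that the exact output shape of shap_values depends on the shap version, so the indexing below is an assumption rather than a guaranteed recipe.

```python
# Sketch of a per-applicant SHAP breakdown (builds on model and X_new from above)
import shap

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_new)

# Assumption: older list-style output with one array per class; index 1 is the
# "YES" (approved) class because the classes sort as ['NO', 'YES']
approval_values = shap_values[1]

# Force plot for the female applicant (row 0): positive contributions push
# towards approval, negative contributions push against it
shap.force_plot(explainer.expected_value[1], approval_values[0],
                X_new.iloc[0], matplotlib=True)

# Repeat with row 1 for the male applicant
```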

The root cause of discriminatory behavior

Let us now look at the root cause of the discriminatory behavior. Here is a bar plot of the data on which the AI was trained. The X-axis is the gender, and the Y-axis is the number of credit loan approvals in the training data.

Bias in training data

The number of male applicants approved in the training data is almost 3 times higher than the number of female applicants. This is called 'bias', and it is what leads to a discriminatory AI.
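The bar plot above can be reproduced with a simple count on the training data, assuming the same hypothetical column names as before.

```python
# Count approved loans per gender in the training data (uses df from the first sketch)
import matplotlib.pyplot as plt

approved = df[df["approved"] == "YES"]
approved["gender"].value_counts().plot(kind="bar")
plt.xlabel("Gender")
plt.ylabel("Number of loan approvals")
plt.title("Loan approvals by gender in the training data")
plt.show()
```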

Gender-less model – A Possible Solution?

We can attempt to create a model without gender. Let us see if it helps. Here is the decision tree model trained without the gender feature; gender does not appear in any of its decision paths.

Decision Tree model without the gender (image by author)
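Here is a sketch of the gender-less model, assuming the one-hot encoded gender columns from the earlier snippets are simply dropped before retraining:

```python
# Retrain the decision tree without any gender features
# (builds on X, y, X_new and DecisionTreeClassifier from the earlier sketches)
gender_cols = [c for c in X.columns if c.startswith("gender_")]

X_no_gender = X.drop(columns=gender_cols)
model_no_gender = DecisionTreeClassifier(max_depth=5, random_state=42)
model_no_gender.fit(X_no_gender, y)

# Re-score the two applicants with the gender columns removed as well
X_new_no_gender = X_new.drop(columns=gender_cols)
print(model_no_gender.predict(X_new_no_gender))  # still e.g. ['NO' 'YES']
```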

Here is the decision path for the female candidate (without considering the gender field). The decision is still not to approve the credit loan.

Decision path for female candidate on model without the gender (image by author)

Here is the decision path for the male candidate (without considering the gender field). The decision is to approve the credit loan.

Decision path for a male candidate on the model without the gender (image by author)

So nothing has changed even though we removed gender. Why is this happening? Let us once again analyze the historical data on which the AI was trained, using the heatmap shown here.

Heatmap of number of approvals in training data

The X-axis of the heatmap is the gender, and the Y-axis is the profession. We see that in the data on which the AI was trained, more females are nurses and more males are salespeople.
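A heatmap like this can be built with a crosstab of profession against gender, restricted to the approved applications (again using the assumed column names):

```python
# Approvals broken down by profession and gender (uses df from the first sketch)
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

approved = df[df["approved"] == "YES"]
counts = pd.crosstab(approved["profession"], approved["gender"])

sns.heatmap(counts, annot=True, fmt="d", cmap="Blues")
plt.title("Number of approvals by profession and gender")
plt.show()
```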

As the female candidate is a nurse, the application is getting rejected by AI. In the same way, as the male candidate is a salesperson, the application is getting accepted by AI.

So even though we removed the gender field, the profession is acting as a proxy for gender.

Thus, both models, with and without gender, lead to the same discriminatory results.

Equal representation of genders in the data

The solution to discriminatory AI lies in having an equal representation of both genders in the data used to train the AI.
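One simple way to move towards equal representation is to resample the training data so that both genders contribute a comparable number of approvals. This is only an illustration of the idea, not the only (or necessarily the best) fairness technique:

```python
# Upsample approved female applications to match the approved male applications
# (uses df from the first sketch; a deliberate simplification of data balancing)
from sklearn.utils import resample
import pandas as pd

approved_f = df[(df["approved"] == "YES") & (df["gender"] == "female")]
approved_m = df[(df["approved"] == "YES") & (df["gender"] == "male")]
rejected = df[df["approved"] == "NO"]

approved_f_upsampled = resample(approved_f, replace=True,
                                n_samples=len(approved_m), random_state=42)

balanced_df = pd.concat([rejected, approved_m, approved_f_upsampled])
# Retrain the decision tree on balanced_df exactly as before
```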

The principle of gender equality needs to be applied to data as well.

Additional Resources

Website

You can visit my website to do analytics with zero coding: https://experiencedatascience.com

Please subscribe to stay informed whenever I release a new story.


You can also join Medium with my referral link


YouTube channel

Here is the link to my YouTube channel: https://www.youtube.com/c/DataScienceDemonstrated

