
Data Visualization: Choose Your Colours Wisely

Making Data Science inclusive for colour blind professionals and taking your data visualization skills to the next level.

Photo by Robert Katzki on Unsplash

As a Data Scientist in a business or corporate environment, your goal is to make your work easy to understand and visually appealing, especially to non-data professionals. Graphs and colours therefore play a crucial role when you communicate insights to business executives. So, make sure everyone in the room can understand your work.

Colour blindness is a genetic condition that makes it difficult to distinguish between specific colours, usually red and green. Colour blindness affects approximately 1 in 12 men (8%) [1]. This means that in the UK, there are about 5.6 million colour blind individuals. Globally, colour blindness affects approximately 300 million people, roughly the entire population of the United States. So, there is a good chance you will meet someone who is colour blind.

But before we move on to data visualization, can you identify the number in the image below? The Ishihara test is a famous test used to detect colour blindness [2]. I will leave the answer* at the end of this article.

Image by Shinobu Ishihara, now in public domain

Adapting Data Visualization

Given time constraints and work pressure, you might sometimes rely on the default colour settings in Matplotlib or Seaborn. However, we need to be mindful of people who have some degree of colour blindness and reduce the risk that someone cannot read your work because of a genetic condition. Luckily, there are colour palettes you can use that are colour blind friendly.

For example, Colour Blind 10, released by Tableau, offers a ten-colour palette. You will notice that next to each colour strip are three numbers, separated by full stops (or periods for those in the US and Canada). Those numbers represent a colour based on the RGB colour model [3]. The model describes how to combine the three primary colours (red, green, and blue) in different proportions to form other colours. Computers usually store each of the three values as an integer between 0 and 255, because 8 bits can represent 256 distinct values [4].
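Recent Matplotlib releases also ship this palette as a built-in style sheet, which saves typing the values by hand. Here is a minimal sketch, assuming your installed version includes the tableau-colorblind10 style; the variable names cb10_colours and first_rgb_255 are only illustrative.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgb

# Read the colour cycle defined by the built-in style sheet without
# changing the global settings for the rest of the notebook.
with plt.style.context("tableau-colorblind10"):
    cb10_colours = plt.rcParams["axes.prop_cycle"].by_key()["color"]

# Convert the first entry back to 0-255 integers so it can be compared
# with the numbers printed next to the colour strips.
first_rgb_255 = tuple(round(channel * 255) for channel in to_rgb(cb10_colours[0]))
print(len(cb10_colours), first_rgb_255)
```

Using plt.style.context keeps the change local to the with block. To apply the palette to every plot in a notebook, plt.style.use("tableau-colorblind10") is the usual alternative.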

Palettes replicated by the author, also found on Tableau

The first colour in the left palette is a dark blue and has the following RGB values:

  • Red: 0
  • Green: 107
  • Blue: 164
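Matplotlib expects RGB tuples on a scale from 0 to 1 rather than 0 to 255, so these values need to be divided by 255 before use. The following is a small sketch of that conversion; scale_rgb is a hypothetical helper name, not part of Matplotlib.

```python
def scale_rgb(red, green, blue):
    """Convert 0-255 RGB integers to the 0-1 floats Matplotlib expects."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError(f"RGB values must be between 0 and 255, got {value}")
    return (red / 255, green / 255, blue / 255)


# 8 bits per channel give 2 ** 8 = 256 possible values (0 to 255).
assert 2 ** 8 == 256

# The dark blue from the Colour Blind 10 palette.
cb_dark_blue = scale_rgb(0, 107, 164)
print(cb_dark_blue)
```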

Applying colour blind friendly palettes

Suppose you are about to plot a graph using Matplotlib. To specify a line colour using RGB values, we pass a tuple of the values to the c parameter when we generate the line chart. Matplotlib expects each value on a scale from 0 to 1, so we divide the 0 to 255 values by 255. In the following code, we scale the first colour in the Colour Blind 10 palette, which resembles dark blue, and set it as the line colour. Although the code is long, it is useful to see a real-life project using what you have just learned. The code below is written in Python in a Jupyter notebook. Note the variables cb_dark_blue and cb_orange passed as the c parameter of ax.plot().
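As a minimal sketch of the approach, the two colours can be defined and passed to ax.plot() as follows. The monthly figures are made up purely for illustration, and (255, 128, 14) is assumed to be the orange entry of the Colour Blind 10 palette.

```python
import matplotlib.pyplot as plt

# Colour Blind 10 values scaled from 0-255 to the 0-1 range Matplotlib expects.
cb_dark_blue = (0 / 255, 107 / 255, 164 / 255)
cb_orange = (255 / 255, 128 / 255, 14 / 255)  # assumed palette entry

# Made-up data, purely for illustration.
months = [1, 2, 3, 4, 5, 6]
product_a = [12, 15, 14, 18, 21, 24]
product_b = [10, 11, 13, 13, 16, 17]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(months, product_a, c=cb_dark_blue, label="Product A")
ax.plot(months, product_b, c=cb_orange, label="Product B")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.set_title("Sales by product")
ax.legend()
```

In a Jupyter notebook the figure renders when the cell finishes. In a plain script, add a call to plt.show() at the end.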

Code lines and Graphs created by the author using Python in a Jupyter notebook.

Unfortunately, not all colour schemes will work well for every colour-blindness condition. However, it is still possible to create your own. Notice that most colour schemes are monochromatic palettes, composed of different shades of a single colour. There are some online resources to help you pick a colour blind-friendly colour scheme. You might want to check ColorBrewer. Here are two examples created using their tools. Don’t forget to tick the colorblind safe box on the left side of your screen.
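Matplotlib also bundles several colormaps derived from ColorBrewer, such as the sequential Blues map. As a rough sketch, a monochromatic palette can be sampled from one of them. The helper name sample_palette, the trimming limits, and the choice of five shades are all illustrative, and it is still worth confirming on the ColorBrewer site that the scheme you pick is marked as colour blind safe.

```python
import matplotlib.pyplot as plt


def sample_palette(cmap_name, n_colours, low=0.3, high=0.9):
    """Pick n_colours evenly spaced RGB tuples from a Matplotlib colormap.

    low and high trim the very pale and very dark ends of the map.
    """
    cmap = plt.get_cmap(cmap_name)
    if n_colours == 1:
        positions = [high]
    else:
        step = (high - low) / (n_colours - 1)
        positions = [low + i * step for i in range(n_colours)]
    # cmap(x) returns an RGBA tuple; drop alpha to keep plain RGB.
    return [tuple(cmap(p)[:3]) for p in positions]


# Five shades of blue, ordered from light to dark.
blues = sample_palette("Blues", 5)
for colour in blues:
    print(tuple(round(channel, 3) for channel in colour))
```

Each tuple in blues can be passed directly to the c parameter of ax.plot(), in the same way as a hand-scaled RGB value.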

Colour blind-friendly palettes created by the author using ColorBrewer

Deep Dive into Data Visualization

Suppose you want to take your data visualization skills to the next level. Then you might want to dive deeper into the visual display of quantitative information. That is right; you could go beyond merely plotting graphs and begin to improve your storytelling skills. There are two books you might be interested in:

The Visual Display of Quantitative Information

The second edition of The Visual Display of Quantitative Information, by Edward R. Tufte, provides excellent colour reproductions of its many graphics. There are over 250 illustrations of the best (and worst) statistical graphics, with a step-by-step analysis of how to display your data. It covers the data-ink ratio, relational graphics, data maps, time series, and multivariate designs. This is the classic book on statistical graphics, charts, and tables, and a must-have for all Data Scientists wanting to expand their storytelling skills.

Storytelling with Data

Storytelling with Data, by Cole Nussbaumer Knaflic, shows the fundamentals of data visualization and, equally important, how to communicate your data effectively. You will learn to leverage storytelling and how to make data vital in your work.

Conclusion

Data visualization and storytelling are fundamental parts of Data Science: they make it possible to convey ideas and insights to others, especially business executives. Frequently, data professionals focus primarily on coding syntax, outputs, readability and other Data Science characteristics. As a consequence, some data professionals use the default colours from data visualization libraries. However, colour blindness affects 1 in 12 men, so default colours can make your work harder to read and make it more difficult to tell a compelling story. Luckily, there are ways to address this, such as choosing a colour blind friendly RGB palette or using online tools such as ColorBrewer. Also, Data Scientists could improve their skills by learning about the visual display of quantitative information.

Thanks for reading. Here are other articles you might like:

Increase Productivity: Data Cleaning using Python and Pandas

How to Boost Your Coding Skills


*Answer to the Ishihara test: those with full colour vision can see the number 74. People with partial colour blindness will see the number 21. Unfortunately, those who are completely colour blind cannot see any number.

References:

[1] https://www.nhs.uk/conditions/colour-vision-deficiency/

[2] https://en.wikipedia.org/wiki/Ishihara_test

[3] https://en.wikipedia.org/wiki/RGB_color_model

[4] https://en.wikipedia.org/wiki/8-bit_color

Disclaimer:

I have no association with, and have received nothing in return from, any of the websites, authors, books, or libraries mentioned in this article.

