The world’s leading publication for data science, AI, and ML professionals.

Introducing data sonification as a core principle to data storytelling

Rethinking design accessibility in the world of data visualizations. The anatomy of a data visualization and visualizing for accessibility.

Hands-on Tutorials

Data journalism is an emerging field, driven by the amount of open data that journalists can now leverage to find new stories, extract insights, and tell those stories powerfully with cutting-edge technologies. As Meredith Levien, the current CEO of the New York Times, has said, "We live in the golden age of marriage between digital affordances and quality journalism." Today I am going to share one aspect of this field – data storytelling.

Table of Contents

1. Defining data storytelling. 
2. The anatomy of an accessible data visualization. 
3. Visualization for Accessibility - data sonification with an example of representing the US population of young adults living below the poverty threshold. 

Defining data storytelling

The datafied world is full of interconnected terminology and buzzwords, much like the world of artificial intelligence, so I thought it made sense to bring a little more clarity to it.

The Venn diagram of data storytelling, created in Figma. Credit: Data Journalism and Visualization with Free Tools

First, there’s data academia and data science. That’s my major at UC Berkeley. It is more research-oriented and involves statistical and decision-theoretic modeling paradigms used to design systems for human-AI collaboration (e.g. self-driving cars). Academic data science usually contains complicated graphs that are not understandable unless you know the full context of the research. "In this case, we visualize data because we want to extract meaning from it. Historically, that’s because scientists had to pay per figure in their paper so they would cram in as much information as they could. And that would make these things complex." (Cairo) Like these:

Data in academia is beautiful in its own way!
Data in an AI research paper I read for my research in RL

On the other hand, there is data art. "The data often doesn’t look like a traditional chart or graphic at all. Instead, it’s something that looks like it belongs in a museum." (Cairo) Here is a constellation of the royal family, where you can see the familial connections between all of this royalty. Another great example of data art is one of Giorgia Lupi’s Latour projects, which I saw last year at one of the New York exhibitions.

Royal Constellations and "Bruises - The Data We Don't See"

"Data storytelling itself exists in the intersection of the two. It’s sometimes described as edutainment because it both informs and delights. And what differentiates data storytelling is that the goal is to reach the broadest audience possible to communicate a clear narrative and key takeaways. A Data Visualization is a tool of data storytelling that we can use to extract insights from the numbers we see in a spreadsheet." (Cairo)

The conflict in Ethiopia by Reuters

Data visualization maps data onto objects like pie charts, maps, bar charts, or even sound (read till the end!). It is so powerful because it enables us to discover patterns, trends, and ultimately stories we couldn’t have seen otherwise.

The anatomy of an accessible data visualization

Data visualization must be accessible to all. Therefore, it is usually made of several layers of content, according to Data Journalism and Visualization with Free Tools:

Fundamental Layers of a Data Visualization on Figma. Credit: Data Journalism and Visualization with Free Tools

Scaffolding layer

This layer contains the features that support the content, such as labels, axes, and scales. Here is a data visualization produced by the Wall Street Journal in 2017, "Track National Unemployment, Job Gains and Job Losses."

"You can see the color legend on the upper right corner and little labels on the axes, the little tick marks on the axis of the chart, etc. This is the scaffolding layer. That is basically what determines how the content is going to be presented, and it puts everything sort of in contextual understanding:"

Source: http://graphics.wsj.com/job-market-tracker/ Credit: Data Journalism and Visualization with Free Tools

Encoding layer

The encoding layer consists of the properties of objects that you vary according to the data, that is, the features that represent the data. In data visualization we use many different methods of encoding. "One of them is height or length like in bar graphs. It’s an important layer because it shows the change of a variable over time, quite insightful metric for journalistic stories." (Cairo)

Here is an example of a recent data viz of mine, analyzing historical Billboard data for December 2020. The x-axis represents how many weeks the song was on the chart, the y-axis represents its position on the Billboard chart (which changes over time), and the size of each circle represents its peak position (which changes depending on the week).

Billboard December 2020 created by me
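To make the idea of encoding concrete, here is a minimal Python sketch of how one row of chart data could be mapped onto the three visual channels described above. It is not the code behind my chart: the song rows are made-up placeholders, `encode` is a hypothetical helper, and the choice to draw a better peak position as a bigger circle is an assumption.

```python
# Hypothetical rows: (song, weeks_on_chart, chart_position, peak_position).
SONGS = [
    ("Song A", 12, 3, 1),
    ("Song B", 4, 40, 25),
    ("Song C", 30, 15, 2),
]


def encode(song, weeks_on_chart, position, peak_position, max_rank=100):
    """Map one row of chart data onto the visual channels of a bubble chart."""
    return {
        "label": song,
        "x": weeks_on_chart,  # horizontal position: weeks on the chart
        "y": position,        # vertical position: current chart rank
        # Circle area: assumed here to grow as the peak position improves.
        "size": 10 * (max_rank - peak_position + 1),
    }


marks = [encode(*row) for row in SONGS]
for mark in marks:
    print(mark)
```

The resulting dictionaries can be handed to any plotting library; the point is that every visual property is a function of a data column.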

Annotation layer

"The annotation layer is the text elements that we add to visualizations to clarify data or highlight important data points in a chart and put it in context. This is important because it enables us to effectively communicate with the public." (Cairo)

Here is a visualization, produced by Reuters, of how California uses dozens of aircraft to battle wildfires. The designers of this project didn’t just map out the aircraft routes and label things; they also provided an annotation below the visualization. It compares the aircraft capacities so that readers can understand how much retardant or water each type of aircraft can drop.

How California uses dozens of aircraft to battle wildfires by Reuters

The "me" layer

This is the layer that enables people to embody the data and interact with it through a few moments of personalization. One example is the NYTimes visualization "Quiz: Let Us Predict Whether You’re a Democrat or a Republican," which predicts your political affiliation.

"This increases engagement, and it tells you, by telling you that you are wrong or that you are right, it’s putting you inside of the data that you are presenting." (Cairo)

"Quiz: Let Us Predict Whether You're a Democrat or a Republican" by NYTimes

Visualization for accessibility – data sonification with an example of representing the US population of young adults living below the poverty threshold.

One of the essential design principles, whether in traditional graphic design or digital product design, is accessibility. As we see more and more data visualization in our daily lives, it’s important not only to integrate basic web accessibility principles, but also to think of new ways to present information to blind and visually impaired people.

I’d like to share the concept of data sonification: turning data into sound so that it can be understood through hearing. I used the UN Sustainable Development open-source data, filtered it down to the US cases, did some basic data cleaning, explored the data, and drew on my 8 years of music education to shape the sound with the tool.

Production of data sonification
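The core move in sonification is a mapping from data values to pitch. Below is a minimal Python sketch of one common way to do that, assuming a linear rescale onto MIDI note numbers. It is not the tool I used: the function names, the note range (48 to 84), and the sample percentages are all illustrative assumptions.

```python
def value_to_midi(value, lo, hi, note_lo=48, note_hi=84):
    """Linearly rescale a value from [lo, hi] to a MIDI note number."""
    if hi == lo:
        return note_lo  # flat data: fall back to the lowest note
    fraction = (value - lo) / (hi - lo)
    return round(note_lo + fraction * (note_hi - note_lo))


def midi_to_hz(note):
    """Equal-temperament frequency, where MIDI note 69 is A4 = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Hypothetical percentages, not values from the UN dataset.
poverty_share = [31.0, 38.5, 42.0, 46.0]
lo, hi = min(poverty_share), max(poverty_share)

notes = [value_to_midi(v, lo, hi) for v in poverty_share]
frequencies = [midi_to_hz(n) for n in notes]
print(notes)
```

Higher values become higher notes, so a listener hears the data rise and fall. Any synthesizer or digital audio workstation that accepts MIDI notes or frequencies could play the result.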

This is the story of children under 18 living below the poverty threshold in states that rank below average, set in the context of the high school dropout rate and the school enrollment rate. The piano in the background represents all the data points for children under 18 living below twice the poverty threshold, the double bass represents the school enrollment rate, the oscillator represents the general quality of higher education, and the glockenspiel represents the high school dropout rate. I filtered the school enrollment data down to below-average values and sonified only those, which is why the double bass doesn’t play all the time.
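The filtering step can be sketched in Python as follows. This is an assumption-laden illustration rather than my actual production setup: the numbers are made up, `to_note` is a hypothetical helper, and the note ranges per instrument are arbitrary. The piano sounds on every data point, while the double bass sounds only where enrollment is below average and rests (`None`) everywhere else.

```python
from statistics import mean


def to_note(value, lo, hi, note_lo, note_hi):
    """Linearly rescale a value in [lo, hi] to a MIDI note in [note_lo, note_hi]."""
    return round(note_lo + (value - lo) / (hi - lo) * (note_hi - note_lo))


# Hypothetical percentages, one entry per time step of the piece.
poverty = [38.0, 42.5, 35.0, 46.0]
enrollment = [88.0, 93.5, 79.0, 91.0]

# Piano: every data point sounds.
piano = [to_note(v, 30, 50, 60, 84) for v in poverty]

# Double bass: only below-average enrollment sounds; None marks a rest.
avg = mean(enrollment)
double_bass = [
    to_note(v, 70, 100, 28, 52) if v < avg else None
    for v in enrollment
]

tracks = {"piano": piano, "double_bass": double_bass}
print(tracks)
```

Keeping one slot per time step in every track, with rests where an instrument is silent, keeps the instruments aligned when they are layered together.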

Here is the result:

https://soundcloud.com/karina-nguien/data-sonification-the-us-population-of-young-adults-living-below-the-poverty-threshold

Thanks for reading! I am currently designing at the New York Times and researching at UC Berkeley. You can also subscribe to my newsletter, where I share more about ethical implications of emerging tech, product design, and data-driven investigations. Learn more about my work here.

Work Cited:

