3 rules for Interactive data visualization

A showcase with R and Highcharts

Vlad Kozhevnikov
Towards Data Science


Why you should make visualisations interactive.

There are several reasons why interactive data visualizations work better than static ones:

  • More information. Interactivity lets you embed far more information than a static visualisation can hold, through tooltips, click events, filtering and so on.
  • Easier perception of complex data. Showing some information only on demand, and hiding it when it is not needed, lets users focus on the details they are looking for. So even though the visualization may contain much more information, it can still give a quick grasp of the key points.
  • Promotes exploration. Everyone loves to interact with the objects they investigate. Responsive data encourages people to explore more and, as a result, gain more insights.
  • More fun. It is as simple as that: when you are tired of dozens of static PowerPoint presentations and dull charts, interactivity brings a bit of variety into our (corporate) life :).

3 rules for perfect interactive visualizations.

Good UX with interactive data visualization relies on 3 primary rules:

  • Overview first,
  • Zoom and filter,
  • Then details-on-demand.

This holy trinity is known as Shneiderman’s Visualization Mantra, after Ben Shneiderman, who formulated it back in 1996.

To demonstrate the real power of the Mantra, let’s explore some relationships using a not very standard chart. I looked into 2017 shipments of smartphones, PCs and tablets by the main vendors, and I would love to try that on a Sankey chart, which, in my opinion, is pretty powerful for showing not just flows but relationships as well.

First, some data wrangling. The initial files have probably all the main problems for which the xls format is hated: multiple tabs, titles, indents and a very untidy layout. After limiting the files to one tab and converting them to .csv, I did all subsequent data wrangling in R, which is ultimately my favourite.
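As a rough sketch of that wrangling step, assume each cleaned .csv ends up as a wide table with one row per company and one column per product. The column names and numbers below are made up for illustration and are not the real shipment figures; in practice `read.csv()` would replace the inline data frame. Reshaping to the long product/company/units layout used by the chart could then look like this:

```r
# Hypothetical wide table standing in for a cleaned .csv export;
# all values are placeholders, not real 2017 shipment figures.
wide <- data.frame(
  company     = c("Vendor A", "Vendor B", "Vendor C"),
  Smartphones = c(300, 200, 150),
  PCs         = c(0, 20, 55),
  Tablets     = c(25, 45, 0),
  stringsAsFactors = FALSE
)

products <- setdiff(names(wide), "company")

# Stack the product columns into one long table:
# one row per (product, company) pair.
sales <- do.call(rbind, lapply(products, function(p) {
  data.frame(product = p, company = wide$company, units = wide[[p]],
             stringsAsFactors = FALSE)
}))

# Drop empty flows so they do not turn into zero-width links.
sales <- sales[sales$units > 0, ]
rownames(sales) <- NULL

print(sales)
```

Only base R is used here to keep the sketch dependency-free; a tidyverse pipeline would do the same job.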

For the Sankey chart I will use the highcharter package in R, made by Joshua Kunst as an R wrapper for the Highcharts JavaScript library. You don’t actually need any JavaScript knowledge to use it, but it can be extended significantly with just a bit of JS.

The first straightforward default chart looks like this:

library(highcharter)

# sales: one row per product-company pair, with the shipped units
highchart() %>%
  hc_add_series(data = sales, type = "sankey",
                hcaes(from = product, to = company, weight = units))

That doesn’t look bad, but it can be even better just by following the 3-step Mantra, so let’s go ahead!

Overview first

Our original chart is a bit too detailed: the coloured flows show every dependency, which is more than “overview first” calls for. Moreover, it does not give you that easy first grasp of what is going on. To deal with that, let’s use a single colour for now, so the details do not distract, and sort products and companies in descending order, so one can immediately see the ranking.
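One way to get that ordering, sketched here on a small made-up `sales` table (the values are placeholders), is to rank products and companies by their total units and sort the rows accordingly before passing them to the chart. That the Sankey nodes follow the row order of the data is an assumption here, so it is worth checking on your own chart.

```r
# Toy data with hypothetical vendors and placeholder numbers.
sales <- data.frame(
  product = c("PCs", "Smartphones", "Smartphones", "Tablets", "PCs"),
  company = c("Vendor B", "Vendor A", "Vendor B", "Vendor A", "Vendor C"),
  units   = c(20, 300, 200, 25, 55),
  stringsAsFactors = FALSE
)

# Total units per product and per company.
product_totals <- tapply(sales$units, sales$product, sum)
company_totals <- tapply(sales$units, sales$company, sum)

# Names ordered from the largest total to the smallest.
product_rank <- names(product_totals)[order(product_totals, decreasing = TRUE)]
company_rank <- names(company_totals)[order(company_totals, decreasing = TRUE)]

# Sort rows by product rank, then by company rank within each product.
sales <- sales[order(match(sales$product, product_rank),
                     match(sales$company, company_rank)), ]
rownames(sales) <- NULL

print(sales)
```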

In addition, I changed the curvature of the flows and made them more transparent, overriding the default values:

hc_plotOptions(sankey = list(
  colorByPoint = FALSE,
  curveFactor = 0.5,
  linkOpacity = 0.33
))

Zoom and filter

Sometimes parts of the Mantra do not apply to a particular chart or specific data. Our Sankey chart is not big enough to need zooming, and filtering out values could make it harder to compare companies.

On the other hand, highlighting the objects of interest serves the same purpose. Each flow is highlighted on hover by default. I would also love to be able to highlight any product or company to see its incoming and outgoing flows.

This is the part where we need pure JavaScript inside R to configure events on hovering the mouse over a node:

hc_plotOptions(sankey = list(
  point = list(
    events = list(
      # Hovering over a node highlights all links attached to it
      mouseOver = JS(
        "function () { if (this.isNode) {
           this.linksFrom.forEach(function(l) {
             l.setState('hover');
           });
           this.linksTo.forEach(function(l) {
             l.setState('hover');
           });
         }
        }"
      ),
      # Leaving the node resets the links to their normal state
      mouseOut = JS(
        "function () { if (this.isNode) {
           this.linksFrom.forEach(function(l) {
             l.setState('');
           });
           this.linksTo.forEach(function(l) {
             l.setState('');
           });
         }
        }"
      )
    )
  )
))
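The two handlers differ only in the state they set, so the JavaScript source could also be generated by a small R helper. This is just a sketch of one possible design: `link_state_handler` is a hypothetical name, not part of highcharter, and the returned string would still need to be wrapped in `JS()` before being passed to the events list.

```r
# Hypothetical helper: returns the JavaScript source of a handler that sets
# the given state on every link attached to the hovered node.
# Use state = "hover" to highlight and state = "" to reset.
link_state_handler <- function(state) {
  sprintf(
    "function () {
       if (this.isNode) {
         var state = '%s';
         this.linksFrom.concat(this.linksTo).forEach(function (l) {
           l.setState(state);
         });
       }
     }",
    state
  )
}

over_js <- link_state_handler("hover")
out_js  <- link_state_handler("")

cat(over_js, "\n")
```

With highcharter loaded, the events list would then read `events = list(mouseOver = JS(over_js), mouseOut = JS(out_js))`.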

However, if we used another chart type, for example a stacked column chart, filtering would work pretty nicely:

highchart() %>%
  hc_add_series(data = sales_stacked,
                type = "column",
                hcaes(x = product, y = units, group = company)) %>%
  hc_plotOptions(column = list(stacking = "normal")) %>%
  hc_xAxis(categories = unique(sales_stacked$product))

By clicking on (or unclicking) legend items you can filter companies; this is the default behaviour of column charts in highcharter. So the chart can be filtered to render only the companies of interest.

Details-on-demand

Here we will use the full power of interactivity to make the visualisation just perfect. Which details might the user be interested in? They could be:

  • Number of units shipped by vendor and product.
  • Number of units of a particular product shipped by a particular company.
  • Share of the product in company’s shipments and share of a company in the sales of a particular product.

The first two are provided in a tooltip by the default behaviour of Highcharts. The last one we have to write ourselves:

hc_plotOptions(sankey = list(
  tooltip = list(
    pointFormat = "{point.fromNode.name} → {point.toNode.name}: <b>{point.weight}</b> Mio units<br/>
{point.product} contribute <b>{point.product_share} %</b> in {point.company} sales: <br/>
{point.company} contributes <b>{point.company_share} %</b> in {point.product} sales "
  )
))
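The tooltip above refers to `product_share` and `company_share`, so the data passed to the chart needs those two columns. A minimal sketch of how they might be computed, using a made-up `sales` table with placeholder numbers and assuming that extra columns of the data frame are carried through to each point:

```r
# Toy data with hypothetical vendors and placeholder numbers.
sales <- data.frame(
  product = c("Smartphones", "Smartphones", "PCs", "PCs", "Tablets"),
  company = c("Vendor A", "Vendor B", "Vendor B", "Vendor C", "Vendor A"),
  units   = c(300, 200, 20, 55, 25),
  stringsAsFactors = FALSE
)

# ave() returns the group total aligned with each row.
company_total <- ave(sales$units, sales$company, FUN = sum)
product_total <- ave(sales$units, sales$product, FUN = sum)

# Share of the product in the company's shipments, in percent.
sales$product_share <- round(100 * sales$units / company_total, 1)

# Share of the company in the shipments of that product, in percent.
sales$company_share <- round(100 * sales$units / product_total, 1)

print(sales)
```

Rounding in R rather than in the tooltip keeps the format string simple.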

I would also add an option to download the chart as an image, in case stakeholders need it in a static format:

hc_exporting(enabled = TRUE,
             buttons = list(
               contextButton = list(
                 y = -30
               )
             )) %>%
  hc_chart(spacingTop = 30)

The final result is:

You can see throughout the article how applying the Mantra improved the chart significantly; now it looks just fine.

Moreover, you probably noticed that the last chart, which is responsive, attracts your attention more, and you might be more eager to explore it than the ordinary gifs before it. That just shows the power of interactivity.

You can find all code on GitHub.

The last chart was written completely in JavaScript, but only for the purpose of embedding it into Medium. You can code it almost completely in R, adding just a bit of JS for events, as shown in my GitHub repo. Very often you will be able to find the pieces of code you need with a little help from Google and Stack Overflow, without going deep into JS.
