Data Visualization is Overrated

Jesse Paquette
Tag.bio
May 5, 2017

An all-too-common strategy of BI/analytics follows some variant of this formula:

  1. Collect data
  2. Process data
  3. Structure data
  4. Visualize data
  5. Discover insights
  6. Use insights to innovate

There’s a major problem with that strategy — did you see it? Right between #4 and #5, there’s a hidden, last-mile step that everyone seems to take for granted.

OK, explain to me: how exactly does data visualization become insights?

I’ve worked on data visualization projects throughout my career, published data visualization papers and developed data visualization software. To be honest, I’m disappointed by the hype. There’s too much subjectivity and too much uncertainty in the process. Data visualization on its own is simply an unreliable, unscientific and ultimately shallow path to insights. It works, occasionally, but not nearly as well as some other methods (see below).

Don’t get me wrong: data visualization still has plenty to offer, just not as the silver-bullet insight machine it’s assumed to be. To understand my mindset, let’s take a step back and define what we’re really trying to do in the data-to-innovation process.

We aim to translate patterns of information within data into the brains of humans

For example, check out this pretty visualization from Quartz:

How well did that visualization translate information into your brain?

If we go to the original article at Quartz, we can find a segment of statistics+text accompanying that visualization.

How well did that translate information into your brain?

One might argue that the information in the statistics+text was drawn from a wider variety of sources than in the visualization, or that the statistics+text could have been displayed at different time points on the visualization itself. Fair. But the fact remains — in that Quartz article, statistics+text did a better job of translating information into the human brain.

Did that visualization have value?

Absolutely it did. Especially in combination with the statistics+text caption.

I’ll admit there are scenarios where data visualization is more useful for generating insights

For example, visualizations of spatial data, like the one above from FiveThirtyEight, are highly useful for producing insights. There are higher-level relationships between the variables (counties) analyzed: some are next to each other, and some are far apart. I can visually group the purple counties on the right side of the map into a concept I understand as “Appalachia”.

Of course, if counties had been pre-grouped into larger regions such as “Appalachia” before the analysis, then there could have been dynamic text that said:

Death from mental and substance abuse disorders is significantly higher in Appalachia (Over 30 per 100k people, compared to a national average of less than 10 per 100k people).

That block of statistics+text represents a unit of insight I can take away, communicate and investigate.
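
As a rough illustration of how such dynamic text could be produced once counties are pre-grouped into regions, here is a minimal Python sketch. Everything in it is an assumption made for the example: the county names, regions, death counts and populations are invented placeholders (not the FiveThirtyEight data), and `County`, `rate_per_100k` and `region_sentence` are hypothetical helpers, not part of any real product. It also only compares rates; it does not test whether the difference is statistically significant.

```python
from collections import namedtuple

# Hypothetical record type: one row per county, already tagged with a region.
County = namedtuple("County", "name region deaths population")

# Invented placeholder numbers, for illustration only.
COUNTIES = [
    County("County A", "Appalachia", 32, 100_000),
    County("County B", "Appalachia", 17, 50_000),
    County("County C", "Midwest", 16, 200_000),
    County("County D", "West", 21, 300_000),
]


def rate_per_100k(counties):
    """Population-weighted death rate per 100k people across the given counties."""
    deaths = sum(c.deaths for c in counties)
    population = sum(c.population for c in counties)
    return 100_000 * deaths / population


def region_sentence(counties, region):
    """Build one statistics+text sentence comparing a region to all counties."""
    members = [c for c in counties if c.region == region]
    if not members:
        raise ValueError(f"no counties found for region {region!r}")

    regional = rate_per_100k(members)
    overall = rate_per_100k(counties)

    if regional > overall:
        comparison = "higher than"
    elif regional < overall:
        comparison = "lower than"
    else:
        comparison = "equal to"

    return (
        f"The death rate in {region} is {comparison} the overall rate "
        f"({regional:.1f} per 100k people, compared to {overall:.1f} per 100k overall)."
    )


if __name__ == "__main__":
    print(region_sentence(COUNTIES, "Appalachia"))
```

The point of the sketch is that the sentence is generated from the same grouped data a map would display, so a reader gets a statement they can quote and check rather than a color they have to interpret.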

What is an insight, really?

An insight isn’t just information; it’s information that changes a person’s perspective on their business or research field. As such, no person or system that analyzes data can know in advance which insights are available for discovery. This is why the current state of exploratory data analysis is so reliant on visualization: a picture is worth a thousand words, as they say.

However, the most effective insight generator -- be it data visualization, statistics, and/or text -- is one that enables a human to creatively drill down to the most pertinent information in an objective and unambiguous way.

And there’s the paradox: if insight generation is so reliant on human creativity, how can it be done objectively?

At my company, Tag.bio, we utilize data visualizations. We have to; they’re not useless. But we also have dynamically generated statistics+text to accompany every analysis, and for our customers this has made a world of difference. Check out some of my other articles to learn more.

Jesse Paquette
Tag.bio

Full-stack data scientist, computational biologist, and pick-up soccer junkie. Brussels and San Francisco. Opinions are mine alone.