
Save your data study from the dumpster

A recipe for turning numbers into useful insights

Source: Steve Johnson, Pexels

Imagine you’re the CEO of a hot new mobile payments service. Every night, you dream Jeff Bezos buys out your company. But for Jeff to care, you need to grow your user base pronto. How?

Decision-making always boils down to these sorts of questions. How should we do something? Which course of action should we take?

Yet, even when drowning in data, you never seem to have quite what you need to make an informed decision. This article not only explains why this keeps happening, but also offers a solution.

One thing’s for sure: it’s not numbers we’re short of. For example, at your imaginary mobile payments company you come across plenty of seemingly insightful stats, often visualized in charts just like this one:

It certainly tells you that, among mobile payment apps, WeChat has the largest user base.

Because you’ve been in this industry for, what, at least two months, you know that both Alipay and WeChat target mainly the Chinese market. So, looking at the relative size of the columns, you infer that there are more Chinese customers using mobile payments than elsewhere in the world.

But that’s pretty much it. Among the things this chart doesn’t tell us are:

A SUPER USEFUL MENTAL MODEL

When seeking to solve a business problem, we have some – maybe many – potential ideas as to how to solve it. We’re open to more, but what we’d really like is an indication of which ideas might work best.

To figure out the stuff we need to know to answer such questions, it’s useful to use a mental model called the DIKW pyramid.

Its name is an acronym for the four layers of stuff you can know: Data, Information, Knowledge, and Wisdom. We’ll soon break down what each of these layers means.

But what stands out immediately is that not all types of "knowing" are the same – and some are worth more than others.

Here’s a great example: despite an explosion of commercial solutions, Marketing Analytics still fails to impress, and its contribution to company-wide performance has been consistently blah for the past six years.

Don’t you find it odd that companies have all this data sloshing about, yet consistently struggle to make good use of it?

Chief Marketing Officer Survey, February 2019

The researchers conducting the survey were very clear about what needs to change:

"Rather than create data and then decide what to do with it, firms should decide what to do first, and then which data they need to do it."

It seems simple enough, but when we look at the pyramid, the issue suddenly becomes obvious:

The actionable insights we are after are in the WISDOM layer, all the way at the top; but the stuff we collect in order to arrive at the insights is DATA, right there at the bottom.

So, to get meaningful answers to business questions, it’s never enough to just collect data, or even to analyze it. It’s about nailing a process to get from DATA up to WISDOM.

Having such a roadmap means you can backtrack from a business question into the sort of data you’ll be wanting. So, right at the initial stages of designing your questionnaire or deciding what data to collect, you’re able to tell what you’ll be needing later on.

With this in mind, can you imagine how great it would be if getting from one level to the next was merely a matter of specific formal steps?

Well, life’s about to get great.

DATA

With data, you know nothing.

Data is merely a collection of observations: a computer server log; a list of transactions; a collection of items. On its own, data tells you nothing; any insight we can derive from it will be the outcome of processing it.

INFORMATION

We obtain information by asking quantifiable questions about the data. A crude but useful rule of thumb is to say that information is essentially counting data points:

Why or How are questions you can never ask of data to derive information. So, by the time we’ve obtained information, we know something – but nothing useful, because we lack context.
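The counting rule of thumb is easy to make concrete. Here’s a minimal sketch, using an invented list of transaction records (the field names and values are hypothetical, purely for illustration):

```python
from collections import Counter

# DATA: raw observations – a made-up list of mobile payment transactions.
transactions = [
    {"platform": "WeChat", "user": "u1"},
    {"platform": "Alipay", "user": "u2"},
    {"platform": "WeChat", "user": "u3"},
    {"platform": "PayPal", "user": "u4"},
    {"platform": "WeChat", "user": "u1"},
]

# INFORMATION: the answer to a quantifiable question – "How many
# transactions did each platform log?" – obtained by counting data points.
transactions_per_platform = Counter(t["platform"] for t in transactions)
print(transactions_per_platform)
```

Note that the output is still just a set of counts: each number seems insightful on its own, but carries no context yet.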

Just look at this jumble of information points. Each seems insightful, but actually tells you very little, because you haven’t got any reference point.

Useless.

KNOWLEDGE

We obtain knowledge when we compare and contrast two or more units of information. Notice it’s always done with bits of information, not bits of data. To see this, recall the digital payments chart.

Consider how pointless it would have been to say: WeChat has 600 million users, and PayPal logged a UK transaction on Monday at 08:35 GMT.

One is an information point, the other is a data point; the two don’t mesh.

Instead, we counted how many users each payment platform had. That was information, derived out of data. By ordering these results, we could tell WeChat has the largest user base. That’s knowledge – a relational fact.
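That step – comparing information points to arrive at a relational fact – is also easy to sketch. The figures below are illustrative only (the 600 million for WeChat echoes the chart; the others are made up):

```python
# INFORMATION: user-base counts per platform (illustrative figures).
users = {"WeChat": 600_000_000, "Alipay": 450_000_000, "PayPal": 210_000_000}

# KNOWLEDGE: a relational fact, derived by comparing information points.
largest = max(users, key=users.get)
print(f"{largest} has the largest user base.")
```

Nothing in the raw data says "largest"; that fact only exists once information points are placed side by side.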

More generally, the way to derive knowledge from information points is by making deductions using logical quantifiers:

Notice you can use information points derived from different data sets. For example, our chart didn’t state where each app’s user base is located. That WeChat and Alipay have mostly China-based clients was information we got elsewhere.

Be careful though! The logical deduction must be valid. For example, we cannot say the following, because this inference is false:

I see A LOT of studies, surveys and analyses falling into this trap: they present a conclusion, without ever pausing to consider alternative – yet equally valid – explanations. (The Economist is a particular repeat offender.)

You’ll appear much more insightful by laying out different possible explanations, instead of opting directly for the one you happen to prefer.

WISDOM

All the previous types of knowing something share a common trait: they can all be crushed with two simple words – So what?

By contrast, wisdom is an actionable insight. You can tell it apart because it cannot be similarly whacked down:

Wisdom is highly desirable because it’s the only outcome that makes a data study useful. It typically addresses cause-and-effect questions – like How and Why – the very questions we’d really like to answer.

There is never just one logical way to solve a business problem, since there will always be several competing possible answers. Still, here are two useful pointers on getting from knowledge to wisdom.

One, wisdom is derived out of knowledge using, again, logical quantifiers, this time between knowledge statements. As before, it’s no use trying to mix, say, knowledge with information.

Two, you’re unlikely to get all the required knowledge from a single dataset.

Please read that last statement again. However wonderful your customer survey, your daily website traffic log or your Kaggle dataset, it’s probably not enough. You’ll need either more information (to combine into new pieces of knowledge), or pre-derived knowledge.

PUTTING IT ALL TOGETHER

By now we’ve realized that, when making decisions, we’re after the stuff at the top of the pyramid, while the data we actually collect is the stuff at the bottom. Without prior planning, odds are we’ll never end up collecting the right data to allow for the full logical deduction chain.

So, to evaluate alternative courses of action with a data-driven approach, you need to backtrack down the logical chain that would support each action:

At first, this will give you a bunch of KNOWLEDGE statements, bound together by logic. But each of these can be broken down into a series of INFORMATION statements. And those tell you exactly the sort of things you need to be counting, directly suggesting which DATA points you should be collecting.
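One way to make the backtracking concrete is to write the chain down as plain data before collecting anything. Everything below is hypothetical – the course of action, claims, and field names are invented for illustration:

```python
# A hypothetical backtracking plan for one course of action: each
# KNOWLEDGE statement lists the INFORMATION it rests on, and each
# INFORMATION point names the DATA you'd have to collect to count it.
plan = {
    "action": "Target low-income customers",
    "knowledge": [
        {
            "claim": "Low-income users prefer payment apps (lower fees)",
            "information": [
                {"count": "app users per income bracket",
                 "data": "survey responses: income bracket + app usage"},
                {"count": "average fee per payment method",
                 "data": "transaction logs with fee amounts"},
            ],
        },
    ],
}

# Reading the plan bottom-up tells you, before the study starts,
# exactly which data to collect.
data_to_collect = [
    info["data"]
    for k in plan["knowledge"]
    for info in k["information"]
]
print(data_to_collect)
```

The payoff is that the data-collection list falls out of the plan, rather than the plan being improvised around whatever data happens to exist.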

Then, once you’ve actually collected the data, you start following your steps. And that’s when you find out if the data, or the information, or the knowledge, actually stacks up in reality:

Maybe you thought targeting low-income customers would be a good idea, because they’d prefer payment apps due to lower transaction costs. But then you collected and counted the data, and the resulting information showed that it’s mostly middle-income folk who use payment apps.

Or maybe you thought that among well-off customers, those with more credit cards would have a better attitude towards digital payments. By backtracking, you now realize you’ll want to know how many credit cards your prospects hold. Once you’ve counted and sorted the data, the knowledge you’ve gained shows it actually makes no difference.

Sure, these targeting ideas turned out to be duds. But you knew beforehand what data you’d need to judge these options, because you’d laid out the full path from the course of action all the way back to the data.

FINAL WORDS

This article almost didn’t see the light of day, even though I’ve been writing it for weeks. When I showed earlier drafts to smart readers I trust, they sort of winced:

"Too abstract," they said.

"Also, nobody cares about digital payments."

Mental Models are abstract, but that doesn’t make them any less useful. There’s unbelievable business value in knowing which questions to ask, and I stand by what I said: The DIKW pyramid model is a practical tool for generating actionable insights, because it cuts through the fog of getting from the data layer to the wisdom layer.

So I published this anyways.

