Photo by kkolosov on Pixabay

This is how the data that will change your business is processed

From questions to answers, an explanation of the complete data analysis process

Samuel Fraga Mateos
Towards Data Science
7 min read · Feb 22, 2021


In my last article, What is Business Analytics? An understandable explanation of what analytics is and what types there are, we defined a word that appears everywhere today: analytics. We also identified 5 different types, which helped us better understand how to answer different kinds of questions by exploring our data. However, extracting value from data is not a single-step process. To obtain the right insights, we have to follow a process that allows our firms to extract the diamond from the rock.

Photo by Dion Beetson on Unsplash

As I have always said, this process is not only about technology: the business and technology sides must always be aligned. The technology side must understand the business needs, and the business side needs to understand how the technology works at a high level.

In this article, we’re going to dive a little bit into the analytics process landscape to understand what the main blocks are and who is mainly responsible for each step.

The big picture

The data analysis big picture

The current literature suggests carrying out the data analysis process in these 6 steps:

  1. Identify questions
  2. Acquire data
  3. Prepare data
  4. Explore data
  5. Visualise outcomes
  6. Take action

Each of these steps is, in turn, split into substeps, but we are not going to go that deep in this article. I want you to understand the main objectives of each step and who is primarily responsible for it.

❓Identifying the questions

If we want to get good answers, we must ask the right questions. There is no secret: the better our questions, the better our answers. It won’t do any good to have a large amount of heterogeneous, excellent-quality data if we don’t ask the right questions.

So in this first step, we must understand our business problems and needs. This task, naturally, falls on the business side.

Depending on the analysis we perform, we may know what we are going to find to a greater or lesser extent, but it’s always important to narrow down the analysis. Otherwise, we may not reach any conclusions or obtain useful insights for our business. The key to success: focus. So always keep the objectives in mind, whether that is solving a problem, identifying an opportunity, or forecasting the results of an activity.

🛢Acquiring data

This is where technology enters the picture, although this time it is still guided by the business. Once we have understood the objectives of our analysis, we must identify where we can gather the data we need to achieve our goals.

And why is the business still participating? Because the data sources could be endless and we must identify first which ones are the best to get data from.

As we have already discussed in previous articles, the data analysis process (especially when we generate predictions or identify actions) produces better outcomes when the data follows the 5 principles of Big Data: volume, velocity, variety, veracity, and value. Even though the next steps are to prepare, clean, and explore the data, there is no magic. As I always say:

The decisions we make are no better than the data on which they are based.

Therefore, in this step, it is especially important to identify the data sources that can contribute the most value.

I think it is a good idea, whenever possible, to mix internal and external data sources, since together they give us a broader view that allows us to reach richer conclusions.

Examples of internal data sources could be:

  • CRM
  • Business databases
  • The activity of our applications

Examples of external data sources could be:

  • Market research
  • Customer feedback
  • Market trends
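
As a rough illustration of mixing an internal and an external source, the sketch below attaches survey feedback to CRM records. All the names and values (`crm_customers`, `survey_feedback`, `combine_sources`, the fields and figures) are hypothetical and only serve the example; real sources will look different.

```python
# Hypothetical internal source: customer records exported from a CRM.
crm_customers = [
    {"customer_id": 1, "name": "Acme Ltd", "annual_spend": 12000},
    {"customer_id": 2, "name": "Globex", "annual_spend": 4500},
    {"customer_id": 3, "name": "Initech", "annual_spend": 800},
]

# Hypothetical external source: satisfaction scores from a feedback survey.
survey_feedback = [
    {"customer_id": 1, "satisfaction": 4},
    {"customer_id": 3, "satisfaction": 2},
]


def combine_sources(customers, feedback):
    """Attach the survey score to each CRM record (None if there is no feedback)."""
    score_by_id = {row["customer_id"]: row["satisfaction"] for row in feedback}
    return [
        {**customer, "satisfaction": score_by_id.get(customer["customer_id"])}
        for customer in customers
    ]


combined = combine_sources(crm_customers, survey_feedback)
for row in combined:
    print(row)
```

Customers without feedback are kept, with an empty score, so that the gaps in the external source stay visible instead of silently shrinking the dataset.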

👨🏼‍🍳 Preparing data

We have reached the kitchen, where technology is the main ingredient. This is the most difficult step in the entire process; ask any data scientist and they will surely tell you it is also the most cumbersome. Why? Because we have to clean and prepare all the data for the exploration that will take place in the next step.

Unfortunately for us, perfect data sources don’t exist. So in this step, it is necessary to deal with data that presents a set of deficiencies. If you don’t know what I’m talking about, I recommend reading my article The 6 dimensions of Data Quality: understanding poor data quality and where it comes from.

Besides, data preparation can be different depending on the way we receive the data. At this point, we must understand that the data can arrive in our systems in two different ways:

  • Batch: the most common approach. When we talk about batch workloads, we mean processing data from the last hour, the last day, or the last month, for example. With batch, we take data from the past and process it.
  • Real-time: the buzzword. We process the data as it reaches us. It is important to note that, in many cases, organizations are not equipped to handle data in real time. Also, processing streaming data often requires a more powerful technology infrastructure, which implies higher costs.

One is not better than the other, and using one or the other will not by itself lead to better conclusions or better decisions. It all depends on the business questions from step 1 that we want to answer.
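
To make the difference between the two modes more concrete, here is a minimal sketch that computes the same metric, the average order amount, in both ways. The order amounts and the names `batch_average` and `RunningAverage` are made up for the example; a real streaming setup would rely on dedicated infrastructure rather than a small class.

```python
from statistics import mean

# Hypothetical order amounts.
orders = [120.0, 80.0, 100.0, 60.0]


def batch_average(amounts):
    """Batch: the data for the whole period is already stored, so process it at once."""
    return mean(amounts)


class RunningAverage:
    """Real-time: update the result every time a new value arrives."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, amount):
        self.count += 1
        self.total += amount
        return self.total / self.count


stream = RunningAverage()
# Simulate the orders arriving one by one.
running_values = [stream.update(amount) for amount in orders]

print(batch_average(orders))
print(running_values)
```

Once all the orders have arrived, both approaches reach the same figure. What changes is when the answer is available, not how good it is.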

Finally, once we receive the data in our systems, we will need to clean, homogenize, review and prepare it for the analysis that we will do next. In essence, we have to separate the data that is fit for the analysis from the data that is not. All this will be done under the data quality and governance policies that should exist in our firm.
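
A small sketch of what cleaning, homogenizing and reviewing can look like in practice. The records, the field names and the rules (non-negative amounts, duplicates dropped) are assumptions chosen for the illustration; in a real project they would come from the firm's own data quality policies.

```python
# Hypothetical raw records, as they might arrive from different sources.
raw_records = [
    {"customer_id": "1", "country": " spain ", "amount": "100.50"},
    {"customer_id": "2", "country": "SPAIN", "amount": ""},        # missing amount
    {"customer_id": "1", "country": "Spain", "amount": "100.50"},  # duplicate
    {"customer_id": "3", "country": "France", "amount": "-20"},    # invalid amount
]


def prepare(records):
    """Split records into clean rows and rejected rows, dropping duplicates."""
    clean, rejected, seen = [], [], set()
    for record in records:
        # Homogenize: same spelling and casing for every country name.
        country = record["country"].strip().title()
        # Clean: the amount must be a number.
        try:
            amount = float(record["amount"])
        except ValueError:
            rejected.append(record)
            continue
        # Review: apply an (assumed) business rule, amounts cannot be negative.
        if amount < 0:
            rejected.append(record)
            continue
        key = (record["customer_id"], country, amount)
        if key in seen:
            continue  # exact duplicate of a row we already kept
        seen.add(key)
        clean.append(
            {"customer_id": int(record["customer_id"]), "country": country, "amount": amount}
        )
    return clean, rejected


clean_rows, rejected_rows = prepare(raw_records)
print(clean_rows)
print(rejected_rows)
```

Keeping the rejected rows, instead of discarding them, makes it possible to review later why they failed and whether the source itself needs fixing.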

💎 Exploring data

Now, it is time to perform the analysis of the data that we have prepared in the previous step. I would not like to repeat myself, so if you have not done it already, I invite you to read my previous article What is Business Analytics? An understandable explanation of what analytics is and what types there are.

There are 5 types of analytics:

  • Descriptive analytics: allows us to understand what has happened in our business. In other words, it is a look at the past.
  • Diagnostic analytics: allows us to understand why a certain situation has happened.
  • Predictive analytics: aims to answer the question of what might happen.
  • Prescriptive analytics: allows us to determine the next actions to be carried out based on the data.
  • Cognitive analytics: aims to draw inferences from the data, as well as from inferences obtained in the past, which allows it to create an autonomous learning loop.

At this point, based on the questions we have posed in step 1, different types of data analysis will be carried out, which will yield different conclusions that will answer our business questions.
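
To give a feel for how different the first and third types are, the sketch below applies a descriptive and a predictive view to the same data. The sales figures are made up, and the straight-line trend is only a deliberately naive stand-in for predictive analytics, not a recommended forecasting method.

```python
# Hypothetical monthly sales for the last four months.
monthly_sales = [100.0, 110.0, 120.0, 130.0]


def describe(values):
    """Descriptive: summarize what has already happened."""
    return {"total": sum(values), "average": sum(values) / len(values)}


def linear_forecast(values):
    """Predictive (naive): fit a least-squares line and project one period ahead."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n  # estimated value for the next period


summary = describe(monthly_sales)
next_month = linear_forecast(monthly_sales)
print(summary)
print(next_month)
```

The descriptive summary answers "what happened?", while the forecast answers "what might happen?", and only the question posed in step 1 tells us which of the two we actually need.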

📈 Visualising outcomes

As I’m sure you already know, human beings are really bad at processing and understanding all the information our brains receive. For this reason, data visualisation is a very important field whose main objective, beyond “painting graphs”, is to make it easier for people to understand the message carried by the data.

I would like to go deeper into this topic in future articles, since I believe it is a really important field: correctly understanding the conclusions generated from data depends on it.

Furthermore, a good dashboard should clearly summarize the conclusions obtained, since our attention span is limited. We must focus and highlight what is important over what is not.

One of the biggest challenges when visualising a set of data is that the business person who is going to look at the graph may not understand its message or its usefulness. If this happens, the visualisation is useless and we need to find another way to represent the idea conveyed by the data.

It is also especially important that the person building the visualisation understands its purpose. To do this, they must understand the information they are working with, what the visualisation is for, and, above all, who it is intended for.

💍 Taking action

So far we have seen the process that data goes through until it is visualised and used by the business side. However, this entire process will have been in vain if no action follows. As we discussed, the ultimate goal of data analysis is to enable well-founded, evidence-based decision-making. If, after the whole process, the decision to make is still not clear, we have probably made a mistake in one of the previous phases. The truly important thing about working with data is not the knowledge that may be hidden in it, but the impact that this knowledge has on our business.

At this point, the key lies in the balance between analytical capacity and business knowledge. If these two factors are supported by a good visualisation, they will enable decision-making.

This is where the data analysis process ends. In this article we have explored each of the phases that the data goes through, from the conception of the question to be answered to the visualisation and understanding of the message conveyed by the data.

All the phases are important; skipping any of them can lead to disastrous results. There are huge risks for our business if we make data-driven decisions without following this process rigorously.

Although this whole process may seem obvious, the reality is that it is not always followed. Every day we are exposed, much more frequently than we think, to lots of information that does not convey its message properly and leads us to wrong conclusions. Asking the wrong questions, using irrelevant data, or performing a poor analysis or visualisation can cause great confusion.


Restorative, entrepreneur, strategic, and futurist. Engineer with a technological background who loves business, finance, and strategy.