How to Build a Data-Driven Company

To get to grips with topics such as Data Science, Big Data and the like, I recommend studying a figure known as the Data Science Pyramid [1][2]. The chart is useful for any consultant or data-driven employee and helps answer questions such as:
- Where does a company currently stand on data-related topics?
- What are the big issues that should be addressed?
- What can be the next stages of development?
- What are the prerequisites? …
This article therefore presents an easy-to-adopt framework based on that pyramid.
The Wisdom Pyramid
One step comes before the wisdom pyramid itself: the company must build a suitable infrastructure. In practice, this usually means setting up a scalable and often more cost-effective cloud solution. The large cloud providers such as AWS, Google Cloud and Microsoft Azure offer strong starting conditions here. For small and medium-sized companies, this is often a better fit than building everything themselves.

The subsequent stages build on each other and can be seen as separate evolutionary steps. In agile terms these are themes; in classic terms they are projects, i.e. larger tasks that the company needs to tackle.
Level 1: Data Sources
Next on the agenda is establishing stable data pipelines from the data sources. Here, too, I recommend well-known and solid tools such as Data Prep, Talend or Alteryx. The resulting ETL and ELT processes should be backed up and monitored once they go live; more on that topic can be found in [3]. Always keep in mind that data quality matters: if the data is unreliable, users and business departments will lose trust in it.
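As an illustration of what monitoring data quality in a pipeline can mean, here is a minimal sketch in plain Python. It is not tied to any of the tools named above; the function `check_quality`, the `QualityReport` class and the sample batch are hypothetical names and data invented for this example. The idea is simply to count missing required values and duplicate keys before a batch is loaded.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    total_rows: int
    missing_values: int
    duplicate_keys: int

    @property
    def passed(self) -> bool:
        # The batch passes only if no problems were found.
        return self.missing_values == 0 and self.duplicate_keys == 0


def check_quality(rows, key, required):
    """Count missing required values and duplicate keys in a batch of rows."""
    seen = set()
    missing = 0
    duplicates = 0
    for row in rows:
        # A value counts as missing if the field is absent, None or an empty string.
        missing += sum(1 for field in required if row.get(field) in (None, ""))
        if row.get(key) in seen:
            duplicates += 1
        seen.add(row.get(key))
    return QualityReport(len(rows), missing, duplicates)


# Hypothetical batch as it might arrive from a source system.
batch = [
    {"order_id": 1, "customer": "A", "amount": 10.0},
    {"order_id": 2, "customer": "", "amount": 5.5},
    {"order_id": 2, "customer": "B", "amount": 7.0},
]
report = check_quality(batch, key="order_id", required=["customer", "amount"])
print(report, report.passed)
```

In a real pipeline, a failed report would typically stop the load or raise an alert, so that problems surface before business users see them.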
Level 2: Data Warehousing and Aggregation
With new paradigms such as ELT and easy-to-implement, scalable cloud services, you can set up Data Warehouses, Data Lakes and Data Hubs very quickly. Building a classic on-premises Data Warehouse with ETL processes and OLAP cubes often ends in delayed projects, additional costs and headaches for IT managers. New techniques, methods and the cloud make it possible to reduce setup times and provide a more flexible and scalable solution. In addition, shadow IT should be avoided by moving functions out of Excel workarounds and similar solutions into the Data Warehouse and BI tools. When planning the architecture for your Data Warehouse or Data Lake, stick to best practices and follow accepted layer models.
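To make the ELT idea and a layer model more concrete, the following sketch uses Python's built-in `sqlite3` module as a stand-in for a cloud Data Warehouse. The three layers (staging, core, mart) and all table names are assumptions chosen for illustration; real layer models differ from company to company. Raw data is loaded unchanged first and only transformed afterwards, inside the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse

# Staging layer: load source data as-is (the "E" and "L" of ELT).
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, customer TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO stg_orders VALUES (?, ?, ?)",
    [(1, "acme", "10.50"), (2, "acme", "4.50"), (3, "globex", "7.00")],
)

# Core layer: clean and type the data inside the warehouse (the "T").
conn.execute(
    """
    CREATE TABLE core_orders AS
    SELECT order_id, UPPER(customer) AS customer, CAST(amount AS REAL) AS amount
    FROM stg_orders
    """
)

# Mart layer: aggregate for reporting and BI tools.
conn.execute(
    """
    CREATE VIEW mart_revenue_per_customer AS
    SELECT customer, SUM(amount) AS revenue
    FROM core_orders
    GROUP BY customer
    """
)

revenue = dict(conn.execute("SELECT customer, revenue FROM mart_revenue_per_customer"))
print(revenue)
```

Because the raw data stays available in the staging layer, a transformation can be changed and re-run later without extracting from the source system again.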
Level 3: Data Exploration
Data exploration aims to find and analyze the implicit, valuable information in data. The larger the data volume (Big Data), the harder this becomes: it is increasingly difficult to get an overview of the data to be visualized without missing anything interesting.

Data analysts and data scientists then work with tools such as Power BI, Tableau and other systems to gain new insights or to visualize information in a way that is easy to understand. These problems are today usually grouped together as Big Data challenges. Visual data exploration is valuable in this case because it makes it much easier to gain insight into the data and to derive knowledge from it.
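Before anything is visualized, a quick profile of the data often shows where to look. The sketch below uses only the Python standard library; the function `profile_numeric` and the sample values are made up for this example and do not stand for any particular BI tool.

```python
import statistics
from collections import Counter


def profile_numeric(values):
    """Return basic summary statistics for a numeric column with gaps."""
    cleaned = [v for v in values if v is not None]
    return {
        "count": len(cleaned),
        "missing": len(values) - len(cleaned),
        "min": min(cleaned),
        "max": max(cleaned),
        "mean": statistics.mean(cleaned),
        "median": statistics.median(cleaned),
    }


# Hypothetical columns from an orders table.
amounts = [10.0, 12.0, None, 11.0, 95.0]
regions = ["north", "south", "north", "north", "south"]

profile = profile_numeric(amounts)
top_regions = Counter(regions).most_common(2)

print(profile)
print(top_regions)
```

A large gap between mean and median, as in this sample, hints at an outlier that is worth a closer look in a chart.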
Level 4: Machine Learning and Data Science
The fourth step is to automate the previously gained insights and, where necessary, to enrich them with self-learning systems. Today, many companies still work with rudimentary AI systems, some developed in-house and some bought in. In the future, these systems will be optimized, replaced by better ones, or completely rethought. This requires good project planning. In an AI system, the algorithm may be the scientifically more interesting part, but in practice the main work lies in integrating the system into the IT infrastructure and the business process [4].
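The point about integration can be sketched with a deliberately simple model. In the example below, a least-squares trend line plays the role of the "AI system", and a reorder rule plays the role of the business process. All names (`fit_linear`, `should_reorder`) and all numbers are hypothetical; a real system would use a proper ML library and a far richer model. What matters is that the business rule only depends on a `predict` function, so the model behind it can later be optimized or replaced.

```python
from typing import Callable, Sequence


def fit_linear(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def should_reorder(predict: Callable[[float], float], next_month: float, stock: float) -> bool:
    """Business rule: reorder when forecast demand exceeds current stock."""
    return predict(next_month) > stock


# Hypothetical sales history: month number and units sold.
months = [1, 2, 3, 4]
units_sold = [100, 120, 140, 160]

predict = fit_linear(months, units_sold)
forecast = predict(5)
reorder = should_reorder(predict, next_month=5, stock=150)
print(forecast, reorder)
```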
Level 5: Wisdom
"Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?" – Thomas Stearns Eliot
The business model of a fully data-driven company is based on monetizing information. Companies can use data for their business cases in different ways. Besides the strategic use of data, they can take exploratory approaches, such as generating new ideas, or evolutionary approaches, such as modernizing a Data Warehouse or introducing self-service BI. Whichever approach a company uses to gain insights from its data, the process is unavoidable: only by understanding and analyzing data can a business successfully meet the needs of its customers.
Conclusion
Becoming a data-driven company and making decisions based on data requires the five steps described in the so-called wisdom pyramid. If a suitable infrastructure has not yet been created, it takes six. The individual steps build on each other and should be tackled one after the other, as each is a prerequisite for the next. This framework has helped me, and still helps me, to assess which phase a company is in, to plan the next steps and to spot problems.
Sources and Further Reading
[1] John D. Kelleher, Brendan Tierney, Data Science, pp. 54–58 (2018)
[2] Jennifer Rowley, The wisdom hierarchy: representations of the DIKW hierarchy, Journal of Information Science, pp. 163–180 (2007)
[3] Christian Lauer, Data Integration – Things to Consider (2021)
[4] t3n, Deep Learning und Data-Science für Einsteiger (2020)