
Metrics Layer: A Single Source of Truth for All KPI Definitions

Learn why implementing a metrics layer will make gathering data-driven insights much more robust in your organization!

Image generated with Midjourney

A metrics layer is a framework that helps organizations unlock valuable insights and make data-informed decisions by consolidating, analyzing, and visualizing key performance indicators (KPIs) in a unified and intuitive manner.

In this article, we’ll explore the significance of the metrics layer, its benefits, how it differs from the semantic layer, and the requirements for a successful implementation.

What is the metrics layer?

A metrics layer (also known as a metrics store or headless BI) is a framework for standardizing metrics, that is, for centralizing how a company calculates them. It can be seen as the single source of truth for the definitions of the KPIs (or metrics; we will use the two terms interchangeably) used within the organization.

💡 Bonus trivia: In case you were wondering, the term "headless BI" derives from the fact that these solutions enable various BI tools to connect to an API for accessing metrics. Consequently, they provide the flexibility to swap out tools while maintaining the integrity of metric definitions.

In essence, the concept of a metrics layer is not entirely unfamiliar. For instance, you already store a project’s codebase in a central repository, versioned with Git. Similarly, the organization’s data warehouse or data lake serves as the single source of truth for all data. Analogously, the metrics layer functions as the single source of truth for the definitions of all KPIs used within the organization.

As illustrated in the schema below, the metrics layer should reside between the data warehouse (or the data source in a broader sense) and all the relevant applications (such as dashboards, reports, AI models, etc.) that consume these metrics.

Let’s further expand on this definition. The metrics layer not only stores all the metric definitions but also translates the requests generated by applications into SQL. Then, the layer executes the requests against the data warehouse/lake to retrieve the desired metrics.
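To make the translation step more concrete, here is a minimal Python sketch of a metric registry and a function that turns a request into SQL. The `Metric` class, the `METRICS` registry, the `compile_to_sql` function, and the table and column names are all hypothetical and chosen for illustration. Real metrics layers also handle joins, SQL dialects, caching, and access control, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A hypothetical metric definition; the field names are illustrative."""
    name: str
    table: str            # source table in the warehouse (assumed name)
    expression: str       # SQL aggregate expression
    filters: tuple = ()   # SQL predicates always applied to this metric


# Central registry: the single place where each KPI is defined.
METRICS = {
    "active_users": Metric(
        name="active_users",
        table="events",
        expression="COUNT(DISTINCT user_id)",
        filters=("is_test_account = FALSE",),
    ),
}


def compile_to_sql(metric_name, group_by=()):
    """Translate a metric request (name plus dimensions) into a SQL string."""
    metric = METRICS[metric_name]
    select_cols = list(group_by) + [f"{metric.expression} AS {metric.name}"]
    sql = f"SELECT {', '.join(select_cols)} FROM {metric.table}"
    if metric.filters:
        sql += " WHERE " + " AND ".join(metric.filters)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql


print(compile_to_sql("active_users", group_by=("country",)))
```

Every consumer that asks for `active_users` gets SQL built from the same definition, so the test-account filter cannot be forgotten in one dashboard and applied in another.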

Why do you need the metrics layer?

You have probably heard some variation of the following sentences in your organization:

  • Why is the value of this metric different on dashboards X, Y, and Z?
  • It appears that this dashboard is using a different definition of metric X. Can we quickly align all of our dashboards to convey the same story?
  • Somebody from management asked about the definition of this metric. Could you investigate the custom queries in this dashboard and determine how we actually calculate it?

Unfortunately, these examples illustrate the types of questions data scientists or data analysts frequently encounter during their daily jobs.

Such questions signal that metrics have become unmanageable, causing chaos for users, whether they are fellow data professionals or non-technical stakeholders.

What makes it even worse is that these users must often make critical business decisions based on these metrics.

The hidden complexity of simple metrics

As businesses grow and develop, the metrics they monitor also evolve. With the increased volume of data collected, its complexity grows as well.

What might initially seem counterintuitive is that even seemingly simple tasks like counting become challenging in analytics, as numerous complexities arise when aggregating raw data.

To illustrate that, let’s consider a relatable example for many organizations: counting the number of users for an app or service. It should be straightforward, right?

However, the following issues may arise when attempting to count users:

  • Determining the time frame for counting users: Should it be done on a daily, weekly, monthly, yearly, or other basis?
  • Segmenting users by geographic area: If segmentation is required, what level of detail should be used? Continent, country, state, city, etc.?
  • Defining an active user: How do we identify an active user? Should a user be considered inactive if there have been no transactions after a specific period? If so, what is that specific period? Additionally, how should users who log in and use the service but make no purchases be handled? The definition of "active users" can vary significantly.
  • Applying data filters or excluding specific users: Should certain users be excluded based on specific flags? For example, should test accounts used by company employees be excluded?
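The questions above can be sketched in a few lines of Python. The event log, its field layout, and the `count_active_users` function below are made up for illustration. The point is only that two reasonable definitions of "active user" give different numbers on the same data.

```python
from datetime import date

# Hypothetical event log: (user_id, event_type, event_date, is_test_account)
EVENTS = [
    ("u1", "purchase", date(2023, 7, 30), False),
    ("u2", "login",    date(2023, 7, 29), False),
    ("u3", "purchase", date(2023, 6, 15), False),
    ("u4", "purchase", date(2023, 7, 31), True),   # employee test account
]


def count_active_users(events, as_of, window_days, event_types, exclude_test):
    """Count distinct users with a qualifying event inside the time window."""
    users = set()
    for user_id, event_type, event_date, is_test in events:
        if exclude_test and is_test:
            continue
        if event_type not in event_types:
            continue
        if 0 <= (as_of - event_date).days < window_days:
            users.add(user_id)
    return len(users)


as_of = date(2023, 7, 31)

# Definition A: any login or purchase in the last 30 days, test accounts included.
count_a = count_active_users(EVENTS, as_of, 30, {"login", "purchase"}, exclude_test=False)

# Definition B: a purchase in the last 30 days, test accounts excluded.
count_b = count_active_users(EVENTS, as_of, 30, {"purchase"}, exclude_test=True)

print(count_a, count_b)  # prints: 3 1
```

Neither definition is wrong, but a dashboard built on definition A and a report built on definition B will disagree by a factor of three on this toy data set.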

Even a seemingly simple task like counting users involves numerous complexities.

Ensuring accuracy in these metrics is crucial, as inconsistent KPIs across multiple outlets, such as dashboards or reports, can make the stakeholders lose trust in the data. Moreover, it can be extremely challenging for the data team to identify all the different locations where varying and often conflicting metric definitions are used.

In such scenarios, the biggest problem is that there is no central repository for storing metric definitions. These definitions are scattered across various BI tools and custom SQL queries that populate views or dashboards. Consequently, they are often recreated and reused without proper oversight and consistency.

That’s where the metrics layer comes to your rescue. Next, let’s look at the benefits of setting up a single source of truth for your KPIs.

The advantages of the metrics layer

Implementing a metrics layer ensures that people across the organization receive the same answer when they ask about a given metric, regardless of which data or non-data professional they ask.

Let’s explore some additional advantages of implementing a metrics layer.

Promotes consistency and builds trust

By enabling clear and reusable business definitions, the metrics layer fosters consistency within the organization. This consistency strengthens stakeholders’ trust in the data.

Moreover, it allows for the inspection of metric lineage – an understanding of how metrics are constructed and which data sources are used.
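As a rough illustration of lineage, the following Python sketch walks a set of metric definitions to find every source table a metric ultimately depends on. The `DEFINITIONS` structure, the metric names, and the table names are assumptions made for this example, not the format of any particular tool.

```python
# Hypothetical definitions: each metric lists the tables and metrics it depends on.
DEFINITIONS = {
    "revenue": {"tables": ["orders"], "metrics": []},
    "active_users": {"tables": ["events", "users"], "metrics": []},
    "revenue_per_active_user": {"tables": [], "metrics": ["revenue", "active_users"]},
}


def source_tables(metric_name, definitions=DEFINITIONS):
    """Return the set of warehouse tables a metric depends on, directly or indirectly."""
    definition = definitions[metric_name]
    tables = set(definition["tables"])
    for upstream in definition["metrics"]:
        # Recurse into upstream metrics (assumes definitions contain no cycles).
        tables |= source_tables(upstream, definitions)
    return tables


print(sorted(source_tables("revenue_per_active_user")))
# prints: ['events', 'orders', 'users']
```

Because the definitions live in one place, a question such as "which metrics break if the `orders` table changes?" becomes a lookup rather than a search through dashboards.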

Embraces the DRY (Don’t Repeat Yourself) principle

Using the metrics layer eliminates the need to define the business logic for each metric in multiple locations. This avoids unnecessary repetition and ensures efficiency in managing metric definitions.

Facilitates adherence to software engineering best practices

Since the metrics layer is defined in code, it becomes easier to follow established practices such as code review and automated testing. The definitions can also be kept under version control with industry-standard tools such as Git, so every change to a metric is tracked.

Future-proofs data consumption outlets

With a metrics layer in place, every consumer reads the same current definition, which reduces the risk that a dashboard, report, or application keeps using an outdated one. This lets developers build analytics features and data-powered applications while keeping metric definitions consistent and up to date.

Supports a variety of tools

The centralized architecture of the metrics layer allows it to seamlessly integrate with a range of tools such as CRMs, BI tools, and internally developed solutions.

Regardless of the tool being used or its internal logic, the end result is based on standardized metric logic.

Provides a single interface for metrics definitions

The centralized architecture of the metrics layer offers a unified interface where all data stakeholders throughout the organization can inspect how specific metrics are defined. This promotes transparency and ensures a shared understanding of metric definitions.

Setting the stage for a successful metrics layer implementation

Having explored the what and why of a metrics layer, it’s now time to dive into the how.

Let’s look into the requirements for a successful implementation of the metrics layer. There are several off-the-shelf solutions available for implementation, each with its own strengths and weaknesses.

However, let’s take a step back and shift our focus to the key characteristics that any implementation of a metrics layer should possess in order to effectively fulfill its role within a modern data stack.

For a successful metrics layer implementation, you need five core attributes:

  • A powerful semantic layer
  • Integration capabilities
  • Performance optimization for low latency
  • Deployment flexibility
  • Enterprise capabilities

Let’s delve into the nitty-gritty of each attribute, beginning with one aspect that can often lead to confusion: the semantic layer, also known as the semantic model or logical model.

Metrics layer vs semantic layer

The semantic layer serves as a mapping between the tables and columns within the data warehouse and meaningful business entities. The semantic layer is where businesses can define dimensions, measures, and metrics using business-friendly language.

💡 It’s important to note that the semantic layer is just one component of the metrics layer and should not be mistaken for the metrics layer itself.

Ideally, these definitions should be easily crafted through an intuitive user interface (UI) and stored in version-controlled text files, usually in formats like YAML or JSON.

Furthermore, to facilitate automation, the implementation should offer a declarative API.
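As a minimal sketch of what a declarative, text-based definition might look like, the Python example below parses a JSON metric definition and checks it before use. The schema (keys such as `measure`, `dimensions`, and `time_grain`), the allowed aggregations, and the `load_metric` function are assumptions for illustration. Each real tool defines its own format.

```python
import json

# A hypothetical metric definition as it might be stored in a version-controlled file.
DEFINITION_JSON = """
{
  "name": "monthly_active_users",
  "description": "Distinct users with at least one event in the calendar month.",
  "measure": {"aggregation": "count_distinct", "column": "user_id"},
  "dimensions": ["country", "plan"],
  "time_grain": "month"
}
"""

REQUIRED_KEYS = {"name", "description", "measure", "dimensions", "time_grain"}
ALLOWED_AGGREGATIONS = {"count", "count_distinct", "sum", "avg"}


def load_metric(text):
    """Parse a JSON metric definition and reject incomplete or unsupported ones."""
    definition = json.loads(text)
    missing = REQUIRED_KEYS - set(definition)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    aggregation = definition["measure"]["aggregation"]
    if aggregation not in ALLOWED_AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {aggregation}")
    return definition


metric = load_metric(DEFINITION_JSON)
print(metric["name"], metric["measure"]["aggregation"])
# prints: monthly_active_users count_distinct
```

A check like this can run in continuous integration, so a malformed definition is caught in review rather than in a dashboard.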

In addition to the strong semantic layer, there are several other attributes that are crucial for a well-rounded metrics layer implementation, as mentioned above. Let’s explore these further.

Integration capabilities

To ensure consistent metric definitions, a headless BI solution should have the flexibility to integrate with popular BI tools, programming languages, ML frameworks, and other relevant technologies.

This requires broad support for standards-based data protocols, APIs, and SDKs.

Performance optimization for low latency

The metrics layer should be designed for high-performance querying, enabling real-time access to metrics at scale.

This is essential for powering automation features such as email triggers and personalized product experiences.

Deployment flexibility

The metrics layer should support a wide range of deployment options, including fully hosted services and cloud-native deployments across different providers.

This flexibility allows organizations to choose the deployment model that best suits their specific needs and infrastructure.

Enterprise capabilities

Considerations such as governance, security, access control, performance, scalability, and high availability are crucial for many organizations.

As the metrics layer becomes a mission-critical component for various applications, tools, and processes, it should possess enterprise-grade features to meet the organization’s requirements.


By considering these requirements, organizations can ensure a comprehensive and successful implementation of the metrics layer within their modern data stack.

Wrapping up

Many companies are still in the early stages of their data science and machine learning journeys. As such, improving Business Intelligence and reporting can address about 90% of the data-related challenges they face.

That’s why it’s crucial for these companies to establish consistent and centralized metric definitions.

The metrics layer serves as the authoritative source for all KPI definitions used within the organization. It acts as a bridge between the data source and the various applications (dashboards, reports, AI models, etc.) that rely on these metrics.

Implementing a metrics layer offers numerous benefits. It ensures consistency and trust in the data, enhances operational efficiency, promotes adherence to best practices, future-proofs data analyses, integrates with different tools, and provides stakeholders with unified access to metric definitions.

By leveraging a metrics layer, companies can improve the precision, reliability, and overall effectiveness of their data-driven decision-making processes.





All images, unless noted otherwise, are by the author.

Originally published at Atlan’s blog on August 2nd, 2023

