Understanding the types of data in a business/organization

Google it and you are guaranteed to find various sources, each with its own version (some say 3 types of data, some say 5, some even say 13). We make it easy for you by summarizing them and taking your understanding to the next level.

Rendy Dalimunthe
Towards Data Science


Before you start rolling out your data management initiatives, be it Master Data Management, Enterprise Data Warehouse, Big Data Analytics, or anything else, you need to start by understanding the most basic ingredient: the data. Only by thoroughly recognizing the characteristics of each type will you know the right way to treat it.

“Data is a precious thing and will last longer than the systems themselves”

Tim Berners-Lee

So let’s get started!

Transactional Data

This type of data describes your core business activities. If you are a trading company, this may include the data of your purchasing and selling activities. If you are a manufacturing company, this will be your production activity data. If you are a ride-hailing or cab company, this will be the trip data. In basic organizational operations, data about hiring and firing employees can also be classified as transactional data. Because a new record is created for every activity, this kind of data has a very large volume compared with the other types, and it is usually created, stored, and maintained within an operational application such as an ERP system.

Master Data

It consists of the key information that makes up the transactional data. For example, the trip data in a cab company may contain driver, passenger, route, and fare data. The driver, passenger, location, and basic fare data are the master data. The driver data may consist of the driver’s name and all of the associated information. The same goes for the passenger data. Together, they make up the transactional data.
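To make the relationship concrete, here is a minimal Python sketch of the cab example. The class names, fields, and sample values are all assumptions made up for illustration; a real system would keep these records in database tables rather than in-memory objects.

```python
from dataclasses import dataclass
from datetime import datetime

# Master data: relatively stable records describing parties.
@dataclass(frozen=True)
class Driver:
    driver_id: str
    name: str

@dataclass(frozen=True)
class Passenger:
    passenger_id: str
    name: str

# Transactional data: one record per business event (here, a trip).
# It points at master data by ID instead of copying names around.
@dataclass(frozen=True)
class Trip:
    trip_id: str
    driver_id: str
    passenger_id: str
    pickup: str
    dropoff: str
    fare: float
    started_at: datetime

# Hypothetical sample records.
drivers = {"D1": Driver("D1", "Alice")}
passengers = {"P1": Passenger("P1", "Bob")}
trips = [
    Trip("T1", "D1", "P1", "Airport", "City Centre", 25.0, datetime(2019, 7, 13, 9, 0)),
    Trip("T2", "D1", "P1", "City Centre", "Airport", 27.5, datetime(2019, 7, 13, 18, 30)),
]

def describe(trip):
    """Resolve the master-data references of a trip into a readable line."""
    driver = drivers[trip.driver_id]
    passenger = passengers[trip.passenger_id]
    return f"{driver.name} drove {passenger.name} from {trip.pickup} to {trip.dropoff}"

print(describe(trips[0]))
```

Notice that the list of trips grows with every ride, while the driver and passenger records are only touched when someone joins, leaves, or changes their details.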

Master data usually covers places (addresses, postal codes, cities, countries), parties (customers, suppliers, employees), and things (products, assets, items, etc.). It is application-specific, meaning that its use is tied to the application that runs the related business process, e.g. the employee master data is created, stored, and maintained within the HR application.

By now, you should have a sense that master data is relatively constant. While transactional data is created at lightning speed, master data changes slowly. Trip data is created every second, but the list of drivers stays the same unless a new driver comes on board or an existing one gets kicked out.

Nowadays, processes within an organization are usually interdependent, which means that a process conducted in one system is related to a process conducted in another system. They may use the same master data. If each system manages its own master data, duplication and inconsistencies may arise. For instance, a customer may be stored as Rendy in system A but listed as Randy in system B, although Rendy and Randy are actually the same entity. But no need to worry, there’s a discipline to manage this kind of situation. It’s called Master Data Management.
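As a rough illustration of the Rendy/Randy problem, the sketch below flags likely duplicates across two systems using string similarity from Python’s standard `difflib` module. The customer lists, the `likely_duplicates` helper, and the 0.75 threshold are assumptions for this example; real Master Data Management matching typically compares many more attributes than a name.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical customer lists held separately by two systems.
system_a = ["Rendy", "Maria"]
system_b = ["Randy", "John"]

def likely_duplicates(left, right, threshold=0.75):
    """Return cross-system name pairs whose similarity reaches the threshold."""
    pairs = []
    for a, b in product(left, right):
        # ratio() is 1.0 for identical strings and 0.0 for nothing in common.
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs

print(likely_duplicates(system_a, system_b))
# → [('Rendy', 'Randy', 0.8)]
```

A match like this is only a candidate: someone (or some additional rule) still has to confirm that the two records really are the same customer before they are merged.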

Reference Data

Reference data is a subset of master data. It is usually standardized data governed by a certain codification (e.g. the list of countries is governed by ISO 3166-1). There’s an easy way to differentiate reference data from master data: always remember that reference data is far less volatile than master data. Let’s go back to our cab company. Tomorrow, the day after tomorrow, or next week, the list of drivers may change whenever a new person comes on board or gets kicked out. But I can guarantee you that the list of countries will remain the same even two decades from now, unless some piece of land declares its independence.
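In practice, reference data often shows up as a fixed lookup list that other records are validated against. Here is a small sketch, assuming a hand-picked subset of ISO 3166-1 alpha-2 codes with short display names and a hypothetical `normalize_country` helper:

```python
# A tiny, hand-picked subset of ISO 3166-1 alpha-2 codes, for illustration only.
COUNTRY_CODES = {
    "GB": "United Kingdom",
    "ID": "Indonesia",
    "US": "United States",
}

def normalize_country(code):
    """Return the canonical upper-case code, or raise if it is not in the reference list."""
    canonical = code.strip().upper()
    if canonical not in COUNTRY_CODES:
        raise ValueError(f"Unknown country code: {code!r}")
    return canonical

print(normalize_country(" gb "))
# → GB
```

Because the list changes so rarely, it can be shared by every application in the organization instead of being maintained separately in each one.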

Reporting Data

It’s aggregated data compiled for the purpose of analytics and reporting. This data consists of transactional, master, and reference data. For example: trip data (transactional + master) on the 13th of July in the Greater London region (reference). Reporting data is very strategic and is usually produced as an ingredient of the decision-making process.
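The aggregation described above can be sketched in a few lines of Python. The trip records, the year, the fares, and the `daily_report` helper are all made up for this example; a real report would usually be produced by a query against a data warehouse.

```python
from collections import defaultdict
from datetime import date

# Hypothetical trip records: (trip date, region, fare).
# The region name plays the role of reference data here.
trips = [
    (date(2019, 7, 13), "Greater London", 25.0),
    (date(2019, 7, 13), "Greater London", 27.5),
    (date(2019, 7, 13), "West Midlands", 10.0),
    (date(2019, 7, 14), "Greater London", 12.0),
]

def daily_report(records):
    """Aggregate trip count and total fare per (date, region)."""
    totals = defaultdict(lambda: {"trips": 0, "revenue": 0.0})
    for day, region, fare in records:
        bucket = totals[(day, region)]
        bucket["trips"] += 1
        bucket["revenue"] += fare
    return dict(totals)

report = daily_report(trips)
print(report[(date(2019, 7, 13), "Greater London")])
# → {'trips': 2, 'revenue': 52.5}
```

The individual trips disappear from view; what remains is a summary that a decision maker can read at a glance.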

Metadata

It’s data about data. Sounds confusing? Indeed. It’s the type of data that made me dizzy when I first entered the data management field. Thankfully, this beautiful picture made it easy for me to comprehend what metadata actually is.

Data & its metadata

If I ask you a question: what is the color of the cat? Just by looking at the data, you can confidently answer: it’s grey. But what if I come up with another question: when and where was this picture taken? Chances are high that you will not be able to give me the right answer by only looking at the data. This is where metadata comes to the rescue. It gives you complete information about the data, including when and where the picture was taken.

So metadata gives you the answer to questions that you cannot answer by just looking at the data. That’s why it is called data about data.
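The same idea in code: a minimal sketch in which the picture’s pixels are the data and a separate dictionary holds the metadata. The `Photo` class, its field names, and every value below are assumptions for illustration; real image files usually carry this kind of information in embedded tags.

```python
from dataclasses import dataclass, field

@dataclass
class Photo:
    pixels: bytes                                  # the data: what you see in the picture
    metadata: dict = field(default_factory=dict)   # data about the data

# All values below are made up for illustration.
cat_photo = Photo(
    pixels=b"\x80\x80\x80",  # stand-in for a single grey pixel
    metadata={"taken_at": "2019-07-13T09:00:00", "taken_in": "London"},
)

def when_and_where(photo):
    """Answer the question the pixels alone cannot: when and where was it taken?"""
    return photo.metadata.get("taken_at"), photo.metadata.get("taken_in")

print(when_and_where(cat_photo))
# → ('2019-07-13T09:00:00', 'London')
```

If the metadata is missing, the function returns `None` for both values, which mirrors the situation of staring at a picture with no idea where it came from.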

So that’s all you need to know about the types of data. This is not an exhaustive explanation, and I can guarantee that you will not yet be able to take on that data scientist job offer just by reading this. But you can still take all these explanations with you and answer with confidence whenever someone you meet on the street asks you about the common types of data in an organization.


Independent researcher focusing on data management, customer experience and the application of blockchain