
Introduction
This is Part 2 of a series of articles on carrying out and implementing a successful Data Management Strategy within an aspiring Digital Organization.
You can find the introduction to this series here.
In this article we will focus on the following topics:
- Data Quality
- Data Architecture
- Data Integration
These are key aspects of every Data Management initiative, and each of them will be discussed in depth. Specifically, they will be explored along the following dimensions:
- People involved (organization)
- Processes (activities)
- Technology (the minimum capabilities a technological solution must offer to support each stage)
So, without further ado, let’s jump into it!
Data Quality
Achieving Data Quality is not an easy task, especially when data comes from multiple sources, in different technological formats and in very heterogeneous environments. And this is the reality of most current organizations.
Having data where, when and how we need it is frequently a challenge. In addition, data is often ‘dirty’: full of errors, omissions or noise. These errors can cause Information and Telecommunication projects, and the exploitation of data in a company more generally, to fail.
Unfortunately, this data layer is a critical component that is often neglected or ignored. Ensuring the quality, integrity and accuracy of data in organizations should be one of the main objectives of any data management strategy, as this is a critical factor in achieving strategic objectives.
Poor data quality has real economic repercussions. The process of Extraction, Transformation and Loading of the data, or ETL, can consume 80% of the development time of a data project. Moreover, when data does not represent reality accurately, any application built on top of it is practically useless.
For all these reasons, it is crucial to understand the critical importance of this aspect of data strategy. The quality of your data is the quality of everything you get from it.
Data Quality Management
Data Quality Management refers to the methodology, policies and processes that an organization uses to ensure a set of critical attributes of the data in the systems and data flows within the company.
The following questions should be asked of every Critical Data Element, or CDE, to check that it fulfills the Data Quality requirements:
- Is it accurate? → Accuracy
- Is it valid? → Validity
- Is it the latest? → Current
- Is it complete? → Completeness
- Is it unique? → Unicity
- Is it consistent? → Consistency
Not all data quality dimensions apply to every Critical Data Element (e.g. for date of birth, data quality is defined on the validity and completeness dimensions).
Data Quality Dimensions
To explore the previous data dimensions in depth, we should define them more thoroughly.
The Data Quality Dimensions refer to the aspects or attributes of the data that can be evaluated and used to measure the quality of the data. There are 6 key dimensions:
Accuracy
It means that data accurately represents the real world. E.g.: spelling mistakes in a recorded value.
Validity
It means that data fits the syntax of its definition (format, type and ranges). E.g.: incorrect values for the sex or type of client.
Current
Data represents reality from the time perspective: it is the latest available information. E.g.: a change in a client’s address that occurs on the 1st of July but is not entered in the system until the 15th of July.
Completeness
It means that data is complete in terms of business importance. E.g.: a client address that lacks a postal code.
Unicity
Data is correctly identified and registered only once. E.g.: a single client registered twice in the database with different IDs.
Consistency
Data is represented in the same way across the entire dataset. E.g.: a client’s account number is deleted while a purchase order is still associated with that account.
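The unicity and consistency examples above lend themselves to automated checks. The following is a minimal sketch in Python, assuming that client and order records are available as lists of dictionaries. The field names (client_id, name, birth_date, order_id) and the matching key (normalized name plus birth date) are illustrative assumptions, not part of any specific tool.

```python
from collections import defaultdict

# Hypothetical sample data, for illustration only.
clients = [
    {"client_id": 1, "name": "Ana Ruiz", "birth_date": "1985-03-12"},
    {"client_id": 2, "name": "John Smith", "birth_date": "1990-07-01"},
    {"client_id": 3, "name": "ana ruiz ", "birth_date": "1985-03-12"},
]
orders = [
    {"order_id": 100, "client_id": 1},
    {"order_id": 101, "client_id": 7},  # refers to a client that does not exist
]


def find_duplicate_clients(clients):
    """Unicity: group client IDs that share the same normalized name and birth date."""
    groups = defaultdict(list)
    for client in clients:
        key = (client["name"].strip().lower(), client["birth_date"])
        groups[key].append(client["client_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def find_orphan_orders(orders, clients):
    """Consistency: list orders whose client_id is not present in the client data."""
    known_ids = {client["client_id"] for client in clients}
    return [order["order_id"] for order in orders if order["client_id"] not in known_ids]


duplicates = find_duplicate_clients(clients)
orphans = find_orphan_orders(orders, clients)
print("Possible duplicate clients:", duplicates)
print("Orders without a matching client:", orphans)
```

In practice the matching key for duplicates is a business decision; exact matching on a normalized name is only the simplest possible choice.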
Data Quality Rules
They are business rules whose objective is to ensure that data complies with the data dimensions in terms of accuracy, validity, integrity, unicity and consistency.
Take the CDE Birth Date of a client as an example: its rules would cover the validity and completeness dimensions.
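As a minimal sketch of what such rules might look like in code, the function below checks a birth date against the completeness and validity dimensions. The expected format (ISO, YYYY-MM-DD), the 120-year upper bound on age and the rule names are assumptions chosen for illustration; a real organization would define its own.

```python
from datetime import date


def check_birth_date(value, today=None):
    """Return the names of the data quality rules that the value fails."""
    today = today or date.today()

    # Completeness: the field must not be empty.
    if value is None or str(value).strip() == "":
        return ["completeness"]

    # Validity (format and type): the value must be a real calendar date in ISO format.
    try:
        parsed = date.fromisoformat(str(value).strip())
    except ValueError:
        return ["validity:format"]

    # Validity (range): not in the future and not implausibly old.
    # The age is approximated by the difference in years.
    failures = []
    if parsed > today:
        failures.append("validity:future_date")
    elif today.year - parsed.year > 120:
        failures.append("validity:range")
    return failures


reference_day = date(2024, 1, 1)
for sample in ["1985-03-12", "", "12/03/1985", "2030-01-01", "1850-01-01"]:
    print(repr(sample), "->", check_birth_date(sample, today=reference_day))
```

Passing the reference day explicitly keeps the check reproducible; without it, the function falls back to the current date.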

Data Quality Process

Define DQ Requirements
- Perform data profiling to help discover data frequencies and formats.
- Data profiling can be done with specialized tools or by running queries (e.g. SQL) directly on the data sources.
- Data quality problems may be discovered during profiling, but the purpose of profiling is to uncover information for the data quality assessment.
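To make the idea of profiling more concrete, here is a small sketch that computes counts and format patterns for a single column. It assumes the column is available as a plain Python list; the function names and the convention of mapping digits to "9" and letters to "A" are illustrative choices, not a standard.

```python
import re
from collections import Counter


def format_pattern(value):
    """Reduce a value to its layout: digits become 9, letters become A."""
    if value is None or value == "":
        return "<missing>"
    pattern = re.sub(r"[0-9]", "9", str(value))
    return re.sub(r"[A-Za-z]", "A", pattern)


def profile_column(values):
    """Return basic frequency and format statistics for one column."""
    non_missing = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "missing": len(values) - len(non_missing),
        "distinct": len(set(non_missing)),
        "patterns": Counter(format_pattern(v) for v in values),
    }


# Hypothetical postal code column, for illustration only.
postal_codes = ["28001", "08012", "", "2800A", "28001", None]
profile = profile_column(postal_codes)
print(profile)
```

A rare pattern in the output (here a postal code containing a letter) does not prove there is an error, but it points to where a data quality rule may be needed.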
DQ Evaluation
- Define data quality rules for accuracy, validity, completeness, etc., together with quality thresholds.
- Perform the data quality assessment by checking the existing data set against the data quality rules.
- Identify data quality problems and update the problem record.
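The evaluation steps above can be sketched as follows. This is only an illustration under stated assumptions: the records, rule names and threshold values are hypothetical, and each rule is modeled as a simple function that returns True when a record passes.

```python
import re

# Hypothetical client records, for illustration only.
records = [
    {"client_id": 1, "postal_code": "28001", "sex": "F"},
    {"client_id": 2, "postal_code": "", "sex": "M"},
    {"client_id": 3, "postal_code": "08012", "sex": "X"},
    {"client_id": 4, "postal_code": "41003", "sex": "M"},
]

# Each rule: (name, dimension, predicate, minimum acceptable pass rate).
rules = [
    ("postal_code_present", "completeness",
     lambda r: bool(r["postal_code"]), 0.95),
    ("postal_code_5_digits", "validity",
     lambda r: re.fullmatch(r"\d{5}", r["postal_code"]) is not None, 0.70),
    ("sex_in_domain", "validity",
     lambda r: r["sex"] in {"F", "M"}, 0.90),
]


def assess(records, rules):
    """Apply every rule to every record and compare the pass rate with its threshold."""
    if not records:
        return []
    results = []
    for name, dimension, predicate, threshold in rules:
        failed_ids = [r["client_id"] for r in records if not predicate(r)]
        pass_rate = 1 - len(failed_ids) / len(records)
        results.append({
            "rule": name,
            "dimension": dimension,
            "pass_rate": pass_rate,
            "threshold": threshold,
            "failed_ids": failed_ids,
            "passed": pass_rate >= threshold,
        })
    return results


results = assess(records, rules)
# The problem record keeps only the rules that fell below their threshold.
problem_log = [r for r in results if not r["passed"]]
for problem in problem_log:
    print(problem["rule"], problem["pass_rate"], problem["failed_ids"])
```

Keeping the failing record IDs alongside each result gives the later root cause analysis a concrete starting point.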
DQ Problem Solving
- For problems identified during the data quality assessment, perform a root cause analysis (RCA).
- Solve problems by eliminating their root cause.
- Review data policies and procedures, if necessary.
DQ Monitoring and Control
Define and develop Data Quality KPI dashboards to follow up on and monitor the data.
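Behind such a dashboard there is usually a simple aggregation. As a hedged sketch, the code below averages rule pass rates into one KPI per dimension and flags dimensions whose score dropped between two assessment runs. The input figures, the function names and the 0.02 tolerance are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def scorecard(rule_results):
    """Average the pass rates of all rules belonging to the same dimension."""
    by_dimension = defaultdict(list)
    for dimension, pass_rate in rule_results:
        by_dimension[dimension].append(pass_rate)
    return {dim: mean(rates) for dim, rates in by_dimension.items()}


def degraded_dimensions(previous, current, tolerance=0.02):
    """Return the dimensions whose KPI fell by more than the tolerance."""
    return sorted(
        dim for dim, score in current.items()
        if dim in previous and score < previous[dim] - tolerance
    )


# Hypothetical (dimension, pass rate) pairs from two consecutive runs.
previous_run = scorecard([("completeness", 0.98), ("validity", 0.90), ("validity", 0.94)])
current_run = scorecard([("completeness", 0.91), ("validity", 0.93), ("validity", 0.93)])

alerts = degraded_dimensions(previous_run, current_run)
print("Current KPIs:", current_run)
print("Dimensions that need attention:", alerts)
```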
Main Roles in Data Quality
The Data Quality Analyst is the key data quality role and is responsible for carrying out the activities associated with the data quality process.
While this is the only role specific to data quality, the analyst works closely with the business owner, data managers, technical owners, and data custodians.
The analyst’s functions include, among others, the definition of data quality rules, profiling, evaluation, analysis of results, and investigation of the causes of data quality problems.

Technical Tools to Ensure Data Quality
The minimum requirements are:
- Ability to perform data profiling, including statistical analysis of datasets.
- Ability to define and execute data quality rules for quality control of critical data.
- Ability to store data quality evaluations and results.
- Ability to support the process of discovering and solving problems.
- Ability to create and visualize data quality scorecards.

Data Architecture
Data architecture refers to the models, policies, rules, or standards that govern what data is collected and how it is stored, organized, and used in an organization’s systems. It covers how each function fits into the overall data management framework.

Data Architecture Roles
The data architect is primarily responsible for designing the data architecture. Although this role is specific to the data architecture, the data architect will work closely with all other data management roles.
The primary responsibilities of this role include designing and optimizing the data architecture across its layers.
Data architects also propose the technologies needed to support enterprise architecture and data management functions.

Conclusion
That is all for Part 2 of the Data Management Strategy series. We will continue to explore Data Architecture and the main Data Integration tools in the next article.
If you have enjoyed the series so far, do not miss the introduction here.
If you liked this post then you can take a look at my other posts on Data Science and Machine Learning here.
If you want to learn more about Machine Learning, Data Science and Artificial Intelligence follow me on Medium, and stay tuned for my next posts!