During my industry career, I have worked in various data-centric roles such as Data Management, Data Engineering and Data Analysis. Looking back, I thought about what went well, what was rather challenging, and how people and their skills (also mine!) influenced the success of data projects. Based on these experiences, I have compiled the following list of important skills that make a good data manager! Please note that it is not based on an empirical study and I am not writing as a researcher here. This list simply aggregates some personal observations, so if you see further skills, just add them in the comments 🙂
1 Modelling
"A picture is worth a thousand words". This is also true in data management. Every data project hits the point where someone needs to draw something that either represents the status quo or captures the different visions of the stakeholders. So, please consider a model as a key technique for efficient communication within a data project! There are really sophisticated modelling tools around that support established modelling languages. And this second part is even more important: please apply established modelling languages! One of the biggest mistakes you can make is to try to put every aspect of the data project into one chaotic PowerPoint diagram. You will end up with a monster that is neither a good data model nor a good process model. Instead, be aware that your data project has dynamic aspects (→ candidates for a process model) and static aspects (→ candidates for a data model or deployment model) that must surely fit together, but not necessarily within one diagram. There are established languages (e.g. BPMN, UML, Chen) and tools (e.g. Visual Paradigm, Modelio, Enterprise Architect) that support syntax-conformant use of these languages or round-trip engineering. A model is something like a construction plan for a physical building, so make sure that your plan is mature. Otherwise, people in the project will interpret the model in different ways, and you will get sloping floors and open walls in your data architecture. And in the age of IoT, fuzzy data models have implications for the physical world, too: they limit the possibility of comparative data analysis and make it difficult to scale use cases.
2 Connecting People
The role of a data manager in an organization is tricky. This person is often neither an IT person who implements databases on their own, nor a business person who is actually responsible for data or processes (that is rather a Data Steward's area of responsibility). So what is the real added value of a data manager (or even a data management department)? In my opinion, you need someone who builds bridges between the different data stakeholders on a methodical level. It is rather easy to find people who consider themselves experts in a particular business area, data analysis method or IT tool, but it is much harder to find one person who is willing to connect all these people and to organize their competencies, as is often required in data projects. So what I am referring to are skills like networking, project management, stakeholder management and change management, which are required to build a data community step by step as the backbone for Data Governance. Without people, a data manager will fail! So in my opinion, a recruiter who is looking for data managers should not only challenge technical skills but also these people skills. Likewise, if you are interested in a data manager position, ask yourself whether you feel comfortable with a job that comes with various organizational responsibilities beyond technical topics.
3 Domain Expertise
I have seen various approaches to building data management organizations. There were closed teams as part of bigger business departments with global accountability for data standards, federated organizations with a small core team and "dotted-line" data managers in other departments, and data management teams that were part of the IT department. Independent of where the organization is located, my experience shows that a good data manager should have fundamental domain knowledge. If a person acts as data manager for a certain business area, she/he should understand what is going on in the value chain, what the critical business issues are, which data exists for which purpose, and which business processes create and consume data and how they are connected. With this knowledge, a data manager can always think one step further when she/he receives a data change request. Maybe there are other requests in the pipeline that could be merged, maybe you have your own ideas regarding the process implementation, maybe you can suggest other stakeholders to involve (→ see #2), maybe there are policies (e.g. GDPR) that must be implemented, maybe you can influence the sustainability of data models beyond this single request, … the list is long! Assuming that a data manager originally has an IT background, building up sufficient domain knowledge can be a challenging task, simply because it is industry-, company- or even department-specific and you cannot simply listen to an audio book to close that gap. However, keep calm and accept that there will be a learning curve over time. What is essential is that you are willing to learn and adopt that non-IT knowledge.
4 Technology Awareness
Refining a data change request from what a business stakeholder expects to what should really be implemented is a process. In my "home" industry, semiconductors, business experts often have advanced IT knowledge as well, which is a curse and a blessing at the same time 🙂 On the one hand, you can dive deeper into technical topics with all the stakeholders. On the other hand, they tend to come up with IT solution proposals instead of problem descriptions, which is not so welcome among the IT experts, because it somehow bypasses their responsibility. Besides the injured sense of honor, there are good reasons why building an implementation concept is separated from the requirements analysis. One of them is that there is not "the one" solution but various options with pros and cons. Just as important as domain expertise is therefore the awareness of up-to-date data management technologies and their capabilities. For instance, not every piece of master data must be entered manually, even if the data steward expects that. Maybe there are attributes of a record that can be deduced by a business rule system or filled via data integration from other systems. Or a problem could be solved by changing the task order in a business process, which could be supported and enforced by workflows. Data catalogs also have some powerful features that help data managers in their overarching role; please see my series on a particular tool for more information. Especially if data managers have a non-IT background, this skill can be challenging. Fortunately, the technical knowledge is often not company-specific, hence the gap can be closed with publicly available material (videos, MOOCs, books, articles, …). From a didactic perspective, I would recommend defining a learning path for junior data managers that provides an overview of and insight into selected technologies.
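To make the point about deduced attributes a bit more tangible, here is a minimal Python sketch of a rule that fills missing master data attributes instead of asking a data steward to type them in. Everything in it is an assumption made up for illustration: the `MaterialRecord` type, the plant codes and the two lookup tables do not come from a real system, and a real business rule engine would of course look different.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class MaterialRecord:
    """Hypothetical master data record; field names are illustrative only."""
    material_id: str
    plant_code: str
    country: Optional[str] = None
    currency: Optional[str] = None


# Hypothetical reference data that would normally come from other systems.
PLANT_TO_COUNTRY = {"P100": "DE", "P200": "US"}
COUNTRY_TO_CURRENCY = {"DE": "EUR", "US": "USD"}


def derive_attributes(record: MaterialRecord) -> MaterialRecord:
    """Fill country and currency by rule, keeping manually entered values."""
    country = record.country or PLANT_TO_COUNTRY.get(record.plant_code)
    currency = record.currency or COUNTRY_TO_CURRENCY.get(country)
    return replace(record, country=country, currency=currency)


# Only the key fields are entered manually; the rest is deduced.
completed = derive_attributes(MaterialRecord("M-1001", "P100"))
print(completed)
```

The design choice here is that a manually entered value always wins over a derived one, so the rule only closes gaps and never overwrites a steward's decision.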
5 Lean Thinking
Last but not least, I have had good experiences with applying the lean philosophy. "Lean" is typically associated with the manufacturing business, but can also be applied to data management. For instance, you can use "Poka Yoke" principles when designing data processes or GUIs, so that certain data quality issues are avoided by design. Or you could apply "5S" to clean up your data architecture, where data valuation helps to judge which data objects in your organization are "waste" and which are business-critical (we have published a research paper that proposes such an approach). Especially in larger data projects, e.g. when you want to integrate a whole domain into a master data management system, it is worth organizing "Kaizen" events, where all the stakeholders work together in a workshop mode to shape the future (conceptual) data landscape. Another great method is "PDCA", which helps to improve data quality iteratively. And there are still more lean methods that I have already applied in data projects and that are useful! In my opinion, the lean approach is the key to achieving sustainable data architectures. This sustainability effect can be observed in traditional applications of lean management as well, as discussed in this (German) article. The generic types of waste in a value chain can also be found in data architectures, such as motion (e.g. unnecessary manual effort for data maintenance) and defects (e.g. wrong or missing data that limits the effectiveness of a data product). All in all, a probably underrated skill, but a great topic for further research!
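As a small illustration of what "Poka Yoke" can mean for data, the following Python sketch makes invalid records impossible to create in the first place, instead of cleaning them up later. It is only a sketch under my own assumptions: the `OrderLine` type, its fields and the allowed units are hypothetical examples, not taken from a real project.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    """Closed list of allowed units; free-text entry is not possible."""
    PIECE = "PC"
    KILOGRAM = "KG"


@dataclass(frozen=True)
class OrderLine:
    """Hypothetical record that rejects invalid values at creation time."""
    material_id: str
    quantity: float
    unit: Unit

    def __post_init__(self) -> None:
        if not self.material_id.strip():
            raise ValueError("material_id must not be empty")
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")
        if not isinstance(self.unit, Unit):
            raise ValueError("unit must be one of the predefined units")


def parse_unit(raw: str) -> Unit:
    """Normalize user input; unknown codes raise ValueError."""
    return Unit(raw.strip().upper())


line = OrderLine("M-1001", 5, parse_unit(" kg "))
print(line)
```

The same idea carries over to GUIs: a drop-down list instead of a free-text field is the user-facing counterpart of the `Unit` enumeration.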
Conclusion
With this story, I wanted to share my view on the optimum skill set of a data manager. A good data manager is way more than a technical expert for data management software. Typically, data management initiatives succeed (or fail) due to the (missing) support of people in an organization. Hence, there are a number of non-technical skills, such as stakeholder management or project management, that should be learned and applied, especially in more complex data projects. Nonetheless, IT knowledge is also important, especially to resolve the waste in data architectures that can be found via the lean methods. Domain knowledge is something that you won't learn at university but over time by actually working in a certain business. Still, you should be open-minded about that knowledge, and you can actively influence the slope of your learning curve, e.g. by interviewing experts! And don't forget to apply established modelling techniques in your data projects! The more people in your organization get used to good models, the more efficient the future communication in data projects will be.