You know, sometimes I get so used to the big data of it all, it doesn’t seem that exciting. Until you do something that reminds you why you do it.
Today, I got to present our work to the executives of a large organisation. It was the first time they were hearing the story of how we’re getting a grip on their entire data landscape. Structured and unstructured data. It’s not an easy story to tell. So you just start at the beginning.

Several laws and regulations around privacy mandate that organisations must know where they store personal data. That’s not an easy question. To answer it, you must harmonise the physical infrastructure with the business. And many business people prefer an attachment in an email over a link to a shared drive. They won’t want to talk about databases and file servers.
When mapping information to physical datastores, there are so many factors to consider. Starting at the beginning, the first question is: what are your enterprise data sources?
- Systems & Applications (ERP, CRM, finance, HR, etc.)
- Databases (SQL, Oracle, etc.)
- Email servers (e.g. Exchange)
- Shared drives (Network folders, local drives, SharePoint, OneDrive, Teams, etc.)
The list of examples gives it away: the last one weighs the heaviest. Depending on the industry, though, the email server may actually contain more information. So how do you get started on understanding all the information these data sources contain? Does anyone know a good story about the location of physical databases?

At this level – although vague – it’s still quite clear what we’re talking about. The business will understand whether they store information in a specific application, on a shared drive or in their mailbox. To make this concrete at an enterprise level, we’ve got to start talking about servers, server names and IP addresses. This can quickly become too technical for a large audience.
When doing this exercise, maybe evaluate how much of your data is being hosted by third parties and why.
The quest for understanding doesn’t end there, though. To really understand the data, a deeper level of detail is required. Systems and applications are linked to databases. Databases are broken into data tables. Data tables into data attributes (columns). File servers are split up into fileshares. Fileshares into folders. Folders contain data objects (e.g. a document or spreadsheet). It’s in those attributes and objects that enterprise data lives.
The cool thing about this is, if you map the actual information inside your enterprise data sources, you can then roll up your findings all the way to the highest level. You’ll know exactly what applications require extra security measures and protect them accordingly. There have been so many cyber attacks in the last year.
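To make the roll-up idea a bit more tangible, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the system names, paths and findings are made up, and a real scan would produce far richer results than a single yes/no flag per column or file.

```python
from collections import defaultdict

# Hypothetical findings: each leaf (a column or a file) is addressed by its
# path through the hierarchy, with a flag saying whether it holds personal data.
findings = [
    (("CRM", "crm_db", "customers", "email"), True),
    (("CRM", "crm_db", "customers", "signup_date"), False),
    (("ERP", "erp_db", "invoices", "amount"), False),
    (("fileserver01", "hr_share", "contracts", "contract_jane.docx"), True),
]


def roll_up(findings, level):
    """Count leaves holding personal data per ancestor at the given depth.

    level=1 groups by application or file server, level=2 by database or
    fileshare, and so on down the hierarchy.
    """
    counts = defaultdict(int)
    for path, has_personal_data in findings:
        counts[path[:level]] += int(has_personal_data)
    return dict(counts)


# Roll everything up to the highest level and list the sources to protect.
by_source = roll_up(findings, level=1)
needs_protection = sorted(key[0] for key, n in by_source.items() if n > 0)
print(needs_protection)  # ['CRM', 'fileserver01']
```

The same function answers the question at any level: `roll_up(findings, level=2)` would show which database or fileshare the personal data sits in.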
But I’d like to end on a positive note. This exercise isn’t just legal and security focused. By mapping your enterprise data sources, not only can you identify where you store personal information, you can also curate content around themes for Knowledge Management. "Hey enterprise, tell me, what do we know about artificial intelligence?" Or collect product numbers for product information management. Product information that almost updates itself. It’s cool stuff.
How are you curating your enterprise information?