Data Teams as Support Teams

Improving communication, reducing friction, and helping the business win

Chad Isenberg
Towards Data Science



During a recent department meeting, I brought up a pain point of our data team to our VP: our customers don’t seem to know or care how much of a burden their requests put on us. Most people who’ve worked in data have had a conversation with a stakeholder where the ask is days’ if not weeks’ worth of work, but the expectation is that it’s trivial. Furthermore, after putting in substantial effort, we find out that the original request was either unnecessary or unrelated to our customer’s actual problem. Essentially, my thesis is that there’s a misalignment of incentives between data teams and our customers.

My VP had a different point of view. He suggested that our business partners should focus on what they do best, and when they need data to execute on their vision, they should put in the request to us. They shouldn’t have to make the call as to whether what they need is “too expensive” or “too time-consuming.”

And my VP isn’t wrong. I doubt any of us who have utilized an internal service organization like HR, legal, IT, or procurement have thought about the complexity or cost of our requests. How long could proof of income for a loan possibly take? What does it take to get my broken laptop replaced? These concerns are wholly absorbed by their respective organizations, and the business as a whole runs more efficiently that way.

Even with all of this in mind, I can’t shake the feeling that data is different. I’m not suggesting that data as a space is generally more difficult, more valuable, or more complex than these other support functions; quite the contrary, I think because the space is relatively immature, a lot of processes and concepts are still missing or half-baked. On the whole, I think data teams tend to have weak operating models supporting questionable value propositions. And at the core, there are two flaws: the data model and the analytics value proposition.

Data Modeling

I’m a huge fan of Chad Sanderson, and his constant push for businesses to prioritize data modeling is one of my core beliefs. To be clear, there’s always data modeling; whether that’s a shared exercise (when done in a warehouse or lakehouse) or an individual one (in a single report or dashboard) is the fundamental question. The lack of core data models produces multiple, conflicting understandings of the business and, in turn, misalignment across teams.

Furthermore, without well-defined semantics, the relationship between the data and the business processes they underpin becomes extremely difficult to understand. Even experienced analysts can spend vast amounts of time trying to tease out misalignment between processes, data, and analytical outputs. When most companies are working with dozens or hundreds of sources, many of which are externally maintained and all of which have independent (operational) data models, is it any wonder that we frequently fail to paint a coherent picture of the business?

Proper data modeling is a complex exercise amongst product (when applicable), operations, and data teams. Most businesses have under-invested if not outright ignored this critical element of developing valuable data assets. Instead of making collaborative decisions up front, we make individual decisions on the back end; engineering and analyst time gets eaten up re-inventing the wheel for each asset, report, and dashboard. We as data professionals feel this pain, and by extension, our customers do through poor (or non-existent) time-to-value.

The Analytics Value Proposition

In my experience, many customers struggle with understanding what questions data can answer, and more importantly, how to tie business outcomes to data assets. XY problems¹ are the lingua franca of business teams when communicating with data teams.

To some extent, the lack of core data models exacerbates our customers’ confusion. They bring (often but not always very reasonable) assumptions about what the data must look like to the table, and because they don’t live in the tables and pipelines, they’re frequently very wrong. Even when our customers know what data assets exist, they don’t necessarily have the expertise to understand whether or not they can be applied to their specific use-cases.

The more intractable problem is that we as an industry have a very tenuous grasp on what analytical outputs are valuable, and we have even less insight into how much value any given output produces. We “know” that there’s something valuable about making data available to the business (look at the success of FAANG+), and there’s research suggesting companies that leverage data are more effective than their peers², but the specific mechanisms are poorly understood.

Essentially, our customers never stood a chance at making good requests, at least not consistently. They don’t (nor should they have to) understand the details of their organization’s data assets, and they’re as much in the dark as the experts about whether or not the report they want will provide any real business value. They can ask for reports all day long, and nobody is the wiser as to whether or not that was a good use of the data team’s time. This lack of a feedback loop is frustrating for data teams, and it can leave customers feeling like they didn’t get what they asked for: not because the deliverable was wrong per spec, but because the requested deliverable was never able to generate value in the first place.

Data Is Not Information

At the core of all of this pain is something really simple. Data is not information. Data is not insight. I’ve seen business people lose a tremendous amount of their valuable time trying to do their own data analysis. Staring at a spreadsheet and endlessly pivoting and filtering does not generate value.

Robert Yi has an excellent article on alignment³ that covers this better than I ever could, but I’ll take a crack at adding some thoughts. Data assets are deeply mired in context that only experts can successfully navigate. The numbers are never just the numbers, which is why analytics is deceptively hard. It’s like trying to follow a map, only to realize that sometimes there are roads where none are depicted, and there aren’t any where there should be. “Oh yeah. Old Main was decommissioned two years ago, but we decided to leave it on the map so you would understand why Main Street runs the way it does.” Travelers would be doomed.

And we end up with these sketchy maps because (to mix metaphors) the sausage-making is especially messy. Analytics is a very complex process by which 1) the business takes information (bookings, billings, purchases) 2) that then gets encoded as data by products and enterprise systems (a record on an account object, a record in an inventory table) 3) that ultimately gets extracted to an analytical data store and re-assembled as information. This last step is always more art than science.
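To make the three steps concrete, here’s a minimal sketch (all field names, codes, and mappings are hypothetical) of how a single business fact gets encoded by an operational system and then has to be re-assembled into information by the analytics layer:

```python
from dataclasses import dataclass

# Step 1: the business event as people experience it:
# "A customer booked $1,200 of services."

# Step 2: how an enterprise system might encode it. Field names,
# codes, and units are system-specific (all names here are invented).
crm_record = {"acct_id": "A-1043", "amt_cents": 120000, "stage": "4"}

# Step 3: the analytics layer re-assembles the information,
# re-deriving semantics the source system never wrote down.
STAGE_MEANINGS = {"4": "closed_won"}  # tribal knowledge, not in the schema

@dataclass
class Booking:
    account_id: str
    amount_usd: float
    status: str

def reassemble(record: dict) -> Booking:
    return Booking(
        account_id=record["acct_id"],
        amount_usd=record["amt_cents"] / 100,     # unit conversion is implicit knowledge
        status=STAGE_MEANINGS[record["stage"]],   # the meaning lives outside the data
    )
```

Notice that both the unit conversion and the stage mapping live in the analyst’s head (or a transformation script), not in the data itself: that gap is exactly where the “art” comes in.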

It’s a game of telephone that we’re pretty bad at. Semantics get lost along the way. We fail to understand important distinctions in the way different sources’ object models interact when we try to combine them. Humans will adjust to slight deviations in business processes over time, but our analytics systems will not.

But there are things we can do, and I’m optimistic about the future. In particular, I think we should focus on:

  1. Improving our understanding of what analytics activities drive value. Research institutions are a part of this, but we can do a lot of field work in the meantime. In much the same way that product companies conduct experiments to link features to value (churn reduction, expansion), data teams should start making investments in experimentation frameworks. We need to know that our data products are wanted (from customer feedback) and valuable (from metrics).
  2. Improving our data models. By more closely aligning our core data assets with the business, we can have better agility in answering our customers’ questions. Because analytics is frequently time-dependent, being able to deliver answers within days rather than weeks is the difference between providing a lot of value and providing none at all.
  3. Improving our customer relationships. We need to remind ourselves that our customers shouldn’t be expected to understand our incredibly complex systems when asking for help. It’s not their fault that analytics is not a solved space. At the same time, they need to be more engaged with our work and its challenges. We can’t simply roll our eyes and say, “That’s a huge amount of work,” and they can’t tune out when we elaborate: “because the field you’re asking for only exists in one of our four billings systems.”
  4. Improving our operating models. We need to develop systems for identifying valuable data work amidst a flurry of customer requests, communicating requirements effectively, and adding accountability and feedback into delivery. We need to empower our customers to ask the right questions, to know that we’re working on things that matter, and to make sure that what we deliver provides value. When it doesn’t, we need the ability to pinpoint where in the process we failed: was it simply a “bad” request? Insufficient requirements? Misunderstood requirements? Poor work quality?
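As a thought experiment, the feedback loop described above could start as something as simple as a structured request record that captures the business problem, the decision it should inform, and a retrospective outcome (the field names and outcome categories here are illustrative, not a prescribed framework):

```python
from dataclasses import dataclass, field
from enum import Enum

class Outcome(Enum):
    PENDING = "pending"
    VALUABLE = "valuable"                # customer confirmed a decision was made with it
    BAD_REQUEST = "bad_request"          # the ask didn't map to the real problem
    BAD_REQUIREMENTS = "bad_requirements"  # the problem was real, the spec wasn't
    BAD_EXECUTION = "bad_execution"      # the spec was right, the work fell short

@dataclass
class DataRequest:
    requester: str
    problem_statement: str   # the underlying business problem ("X"), not the asset ("Y")
    expected_decision: str   # what the customer will do differently with the answer
    outcome: Outcome = Outcome.PENDING
    notes: list = field(default_factory=list)

# Intake captures the problem and intended decision up front...
req = DataRequest(
    requester="sales-ops",
    problem_statement="Renewal forecasts are consistently missing",
    expected_decision="Reprioritize account outreach by churn risk",
)

# ...and a retrospective closes the loop after delivery.
req.outcome = Outcome.VALUABLE
```

Even a record this crude forces the XY conversation at intake time and gives the team a way to distinguish a bad request from bad requirements or bad execution after the fact.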

Conclusion — There’s No “I” in Team (or in Data)

Especially in these difficult times, it’s easy to point fingers and feel frustrated when we see our counterparts fail. At the end of the day, we’re all in this together. Data teams want our companies to succeed, which means that we need our business partners to succeed. Likewise, they’re invested in our success. We all need to remind ourselves that we’re on the same team, sharing the same challenges and having our own part in overcoming them.

References

[1] Wikipedia contributors, XY problem (2022, December 11), Wikipedia, The Free Encyclopedia

[2] Weill, Peter and Woerner, Stephanie L., Dashboarding Pays Off (January 2022), MIT CISR Research Briefings No. XXII-1

[3] Yi, Robert, Communication is about alignment (December 2022), Win with Data
