Key Metrics for Data Science Team Success

How data science team leaders can measure team performance and demonstrate success for the C-suite

Josh Poduska
Towards Data Science


Photo by Kaleidico on Unsplash

As the field of data science continues to grow and mature, many data science leaders struggle when C-suite executives ask them to demonstrate consistent success. A team may have delivered substantial projects and models that produce tangible results, but data science is still science: it involves experimenting and learning on the way to actionable insights. Some projects won’t produce new insights, and experimental programs may point the way to longer-term impact without yielding useful results today. That’s OK at first, but you still need a process to show the C-suite ongoing growth and refinement.

A recent survey from Wakefield Research found that 71% of data executives say their company leadership expects revenue growth from their investment in data science. Senior leaders don’t just want incremental growth from these programs — 25% of data science leaders say their company leadership expects double-digit growth from data science, adding pressure to deliver quickly.

But a good data science program can take months or years to get a flywheel of innovation turning. So how do you show consistent results each quarter, demonstrating that you’re not only discovering useful insights today but also building an analytics machine that will add value for many years?

The trick is to move beyond metrics that just show validity and results for certain projects, and deploy metrics that show the overall performance of your complete data science program. You want to show how your team is getting faster, how they are delivering measurable results, and how the team is situated to keep growing. These are the key areas to evaluate for your data science KPIs, to show how your group is adding value to the broader organization.

Demonstrate velocity

One of the best ways to show that you’re not just flying by the seat of your pants (like many fledgling teams) is to track the velocity of your overall performance. When you start a new project, you have objectives and theories, but you don’t know how the project will turn out. Much of data science is a research process, and a team may try 99 experiments before the 100th yields an interesting result. Even “no insight” can be a valid outcome.

But “no insight” shouldn’t be the only output in those situations. You dedicated significant staff resources and learned something from that work, and that knowledge can be put to use in the future. You need a system to capture this work and catalog the modeling datasets, features used, and initial results, so that when a similar project comes along you have a head start in terms of validated data, preliminary models, or effective approaches. Making your process repeatable and trackable is key to building a high-velocity data team, so you can move from bespoke “artisan thinking” to reproducible and reusable “modular system” thinking.

The best performing teams move fastest when they build on the past. In terms of metrics, I’ve worked with one data scientist who tracks component reuse as one of the KPIs for their team. People on that team are recognized when they create a widely used component, like a great dataset diagnostic tool, and given credit for their contribution to the overall success.
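As a rough illustration of how a component-reuse KPI like this might be computed, here is a minimal Python sketch. The project names, component names, and the definition of "reuse rate" (the share of distinct components used by at least two projects) are all assumptions made for the example, not the method used by the team described above.

```python
from collections import Counter

# Hypothetical inventory: which shared components each project uses.
# Project and component names are illustrative, not from the article.
PROJECT_COMPONENTS = {
    "churn_model": {"dataset_diagnostics", "feature_store_client"},
    "pricing_model": {"dataset_diagnostics", "elasticity_utils"},
    "agent_productivity": {"dataset_diagnostics", "feature_store_client"},
}


def component_usage(project_components):
    """Count how many projects use each component."""
    usage = Counter()
    for components in project_components.values():
        usage.update(components)
    return usage


def reuse_rate(project_components):
    """Fraction of distinct components used by at least two projects."""
    usage = component_usage(project_components)
    if not usage:
        return 0.0
    reused = sum(1 for count in usage.values() if count >= 2)
    return reused / len(usage)


if __name__ == "__main__":
    for name, count in component_usage(PROJECT_COMPONENTS).most_common():
        print(f"{name}: used by {count} project(s)")
    print(f"Reuse rate: {reuse_rate(PROJECT_COMPONENTS):.0%}")
```

Ranking components by usage count also gives a simple way to identify, and recognize, the authors of the most widely used pieces.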

Glenn Hofmann is the Chief Analytics Officer for New York Life, one of the world’s largest life insurance companies, which has a track record of more than 175 years and operates in a heavily regulated industry. Hofmann was an early proponent of a more systemized approach to data science. Over the past five years, the Hofmann-led Center for Data Science and Artificial Intelligence (CDSAi) at New York Life grew from a team of seven people to nearly 50, and invested in infrastructure that captures business-critical results. In that time, the team has created comprehensive models for key business targets (such as customer retention and agent productivity) that can be versioned or expanded rapidly with a new idea from a business partner. CDSAi has also created an environment utilizing Python and R stacks, a data science workbench platform, and a Kubernetes cluster to automate procedures and speed up deployment.

“We’ve eliminated months of recoding work and can bring Python and R code directly to production. Our models can now be accessed from any production platform in the company via an API,” Hofmann notes. By creating an API that others can use in the company, CDSAi and New York Life overall can quickly deploy new projects and support a wide range of business requests.

Leaders need to take a systematic approach to managing data science, changing the process from a complicated moonshot every time to just another lap around the racetrack. By standardizing templates, reusing datasets, and saving software configurations, you reduce the time to iterate and can deploy a model in hours, which lets your team iterate and refine far more quickly.

Deliver ROI

The importance of data science has elevated the field, but it’s also created high expectations, especially with C-level executives who are unclear on what’s possible with data science.

The Wakefield Research study found that while company leadership may have double-digit revenue expectations for data science, today 82% of companies are making splashy short-term investments without recognizing the ongoing benefits of investing in data science. 46% of those executives say these short-sighted investments happen often or all the time.

If a model fails, the investment and budget may just disappear. 78% of data executives have seen their companies stop a data science initiative or cut back investment if a data model fails, including 26% who say it has happened several times. So you need to make sure you set expectations and show results that improve over time.

One way to show direct ROI is by using control groups in the production environment. This helps you show the value across the company, especially to senior business leaders outside of the data science or IT organizations. One company I know created a ‘global holdout’ group from their customer segmentation and price elasticity models. A year later, they compared revenue from the holdouts to revenue from customers guided by the predictive models. With this side-by-side comparison, the team demonstrated more than a $1 billion lift in revenue, a result that gave the data science team significant credibility and supported new hires.
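The arithmetic behind a holdout comparison like this can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the revenue figures are made up, the function name `holdout_lift` is hypothetical, and a real analysis would also check that the holdout group was randomly assigned and that the difference is statistically significant.

```python
from statistics import mean


def holdout_lift(treated_revenue, holdout_revenue):
    """Compare per-customer revenue between model-guided and holdout customers.

    Both arguments are lists of revenue per customer over the same period.
    """
    avg_treated = mean(treated_revenue)
    avg_holdout = mean(holdout_revenue)
    if avg_holdout == 0:
        raise ValueError("Holdout average revenue is zero; relative lift is undefined.")
    per_customer_lift = avg_treated - avg_holdout
    return {
        "per_customer_lift": per_customer_lift,
        "relative_lift": per_customer_lift / avg_holdout,
        # Extrapolate the per-customer difference to every treated customer.
        "total_incremental_revenue": per_customer_lift * len(treated_revenue),
    }


# Made-up annual revenue per customer, for illustration only.
treated = [120.0, 135.0, 110.0, 150.0, 125.0]
holdout = [100.0, 115.0, 95.0, 110.0]

result = holdout_lift(treated, holdout)
print(f"Lift per customer: {result['per_customer_lift']:.2f}")
print(f"Relative lift: {result['relative_lift']:.1%}")
print(f"Total incremental revenue: {result['total_incremental_revenue']:.2f}")
```

Scaling the per-customer difference by the number of treated customers is what turns a small average lift into a headline figure that senior leaders can compare against the cost of the program.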

One final point on ROI: don’t forget to show people the big picture. Make sure you socialize aggregate portfolio metrics. Even if it’s just an approximation, you want to show the impact of your whole portfolio of projects. You also want stakeholders to be aware of how many projects are in-flight, in the pipeline, and on the backlog. Executives may only focus on a couple of projects in their line of business or department, so showing them the big picture can be illuminating. This cadence of regular reporting also gives you a chance to demonstrate the collective achievement of the whole data science team along with individual contributions.
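A portfolio roll-up of this kind does not need heavy tooling. The sketch below is one possible shape for it in Python; the `Project` fields, the status labels, and the value estimates are assumptions invented for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    status: str  # assumed labels: "in_flight", "pipeline", "backlog", "delivered"
    estimated_annual_value: float  # rough estimate; an approximation is fine


def portfolio_summary(projects):
    """Count projects by status and total the estimated value of delivered work."""
    counts = Counter(p.status for p in projects)
    delivered_value = sum(
        p.estimated_annual_value for p in projects if p.status == "delivered"
    )
    return {"counts": dict(counts), "delivered_value": delivered_value}


# Hypothetical portfolio, for illustration only.
portfolio = [
    Project("retention_model", "delivered", 1_200_000.0),
    Project("pricing_refresh", "delivered", 400_000.0),
    Project("lead_scoring", "in_flight", 300_000.0),
    Project("fraud_signals", "in_flight", 250_000.0),
    Project("call_center_routing", "pipeline", 150_000.0),
    Project("document_search", "backlog", 80_000.0),
]

summary = portfolio_summary(portfolio)
print(summary["counts"])
print(f"Estimated value delivered: ${summary['delivered_value']:,.0f}")
```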

Grow your team

Beyond the daily work of finding insights, building a strong data science team is one of the biggest challenges I see in the field today. The Wakefield study found 48% of data executives complain of inadequate data skills among employees, and 44% say they’re not able to hire enough talent to scale data science in the first place.

Recruitment and retention will be ongoing issues, and if you want to build a foundation for significant results, you need a strong, consistent team. You have to show your staff that you take data science seriously and have a rigorous program. Then when reviewing your goals with the CEO, you can also report how many people you’ve hired, your turnover rate, and how quickly new hires are able to contribute meaningful work. By supporting your team with a robust program and clear procedures, the team can focus on their work.
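The team metrics mentioned above can be computed from basic HR data. The following Python sketch assumes one common definition of turnover (departures divided by average headcount over the period) and treats "time to contribute" as the number of days between a hire's start date and their first meaningful deliverable; both definitions, and all of the numbers, are illustrative assumptions.

```python
from datetime import date
from statistics import median


def turnover_rate(departures, headcount_start, headcount_end):
    """Departures in a period divided by the average headcount for that period."""
    average_headcount = (headcount_start + headcount_end) / 2
    return departures / average_headcount


def median_ramp_up_days(hires):
    """Median days from start date to first meaningful contribution.

    `hires` is a list of (start_date, first_contribution_date) pairs.
    """
    return median((first - start).days for start, first in hires)


# Made-up figures for illustration only.
hires = [
    (date(2021, 1, 4), date(2021, 2, 15)),
    (date(2021, 3, 1), date(2021, 3, 29)),
    (date(2021, 6, 7), date(2021, 8, 2)),
]

print(f"Turnover rate: {turnover_rate(3, 28, 32):.0%}")
print(f"Median ramp-up: {median_ramp_up_days(hires)} days")
```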

To do this, you need to deploy technology that supports a decentralized team. With ongoing work-from-home programs, you may not have even met team members in person, including your direct reports! With this level of decentralization, you have to do more to connect new employees to the business, so they have a better sense of what the company needs to do. Make sure new people meet with line-of-business owners, marketing and sales teams, and the IT teams that will help them get work done.

As your team grows, the processes you put in place today will pay off as new team members start working with experts in particular data science tools or fields, building on the knowledge you’ve captured and experiments you’ve conducted. 39% of executives say one of the top obstacles to data science having a great impact at their organization is inconsistent standards and processes. Make it easy for new team members to digest your corpus of knowledge and learn how the rest of the team gets work done, so they can mirror this process.

One tactical idea is to hold “lunch-and-learns” for the data science team, and also for the whole company so that everyone learns what your team has done and where it’s going next. New York Life’s CDSAi not only conducts monthly lunch presentations on projects, methodologies, and data usage, but also hosts an annual data science expo and regular forums featuring external guest speakers who educate and inspire New York Life’s broad data community.

From my experience at Domino Data Lab, I’ve seen many companies do things correctly, and I’ve seen many more that come to us after months of flailing around trying to gain traction. Let’s just say the heads of data science programs at those less sophisticated companies don’t usually last long in the role. So I urge data leaders to create a sustainable, long-lasting program.

If you build a program that continues to accelerate with repeatable processes and reusable assets, keeps the C-suite informed about the big picture, and helps your team feel integrated and effective, then you can build a team that will make a major impact and deliver significant, measurable results.
