Your data science project needs a win condition.

Walking away from a project is hard, but you need to know when to do it.

Jeremie Harris
Towards Data Science
4 min read · May 26, 2020


Winston Churchill, presumably after completing his first data science project. Source: https://www.flickr.com/photos/levanrami/45544041992

One of the coolest things about working at the SharpestMinds data science mentorship program is that we get the chance to see what projects people build, and which ones get them hired.

And the single most important thing that we’ve learned over and over again is that the very best projects aren’t projects at all: they’re products.

In this post, I want to explain what it means to build a product, and how building products can help you figure out how good your project has to be before it’s ready to ship (and impressive to employers!).

What’s a product?

A data science product is a project with a use case. It’s something you would use yourself, ideally.

It can be a recommender system, a dashboard, or a classifier. But whatever it ends up being, it should be useful to someone — and the more concrete your idea of who that someone is, the better.

Focusing on usefulness and building products makes it possible to define a “win” condition for your project: the level of model performance (accuracy, recall, AUC, etc.) that makes your model “good enough for practical purposes”.

Win conditions are the single most commonly missing ingredient in data science projects. Which is really bad, because they’re one of the most important things companies need you to be able to figure out.

As a data scientist, you’re not just going to be downloading datasets, cleaning them, and tuning a model against a loss function that some manager handed down from on high. More often than not, you’ll have to come up with that loss function yourself, and figure out how good it has to get before you have a product you can ship.

As a result, companies — and hiring managers in particular — spend an awful lot of their time during interviews looking for hints that you’re able to figure out win conditions. And if your project doesn’t include one, you’re basically removing the best opportunity you have to tick that box.

Here’s how to choose one.

Setting your win condition

Your win condition will have two components: a metric you want to optimize, and a threshold beyond which you consider that metric to be optimized.

But you can only figure those things out if you have a clear use case in mind for your project.

For example, suppose you’re building a fashion recommender system to help people find matching shirts and pants. Does it matter if your performance metric is 1% higher? Will users be able to tell the difference between 0.99 AUC and 0.98 AUC?

Probably not. So don’t make your win condition 0.99 AUC. The time you’d have to spend obsessing over your model to squeeze out that last fraction of a percentage point of performance could be better invested doing other things, like collecting more data, improving your visualizations, or solving a completely different problem.
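To make that concrete, here’s a minimal sketch of what writing down a win condition before you start tuning might look like. It uses Python and scikit-learn; the 0.85 AUC threshold and the toy dataset are stand-ins chosen for illustration, not values from any real project.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Win condition declared up front: a metric plus a "good enough" threshold.
# The 0.85 value is a placeholder -- in practice it comes from your use case.
WIN_METRIC = "AUC"
WIN_THRESHOLD = 0.85

# Toy data standing in for your real problem.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

if auc >= WIN_THRESHOLD:
    print(f"{WIN_METRIC} = {auc:.3f} -- win condition met, ship it.")
else:
    print(f"{WIN_METRIC} = {auc:.3f} -- below {WIN_THRESHOLD}, keep iterating.")
```

The code itself is trivial; the point is that the threshold gets written down before any tuning happens, so you know exactly when to stop.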

But suppose instead that you’re building a cancer diagnosis tool. That 1% could save lives. So every percentage point of performance is incredibly valuable, and it’s definitely worth investing heavily in tuning your model.

Win conditions aren’t always simple metrics, either. If your project is to build a fashion recommender system for online shoppers, the finished product isn’t an algorithm: it’s an app, and your win condition includes actually shipping it. If your project is to build a dashboard that makes it easy to spot certain trends, your win condition is the deployment of that dashboard.

Here’s a great example of how to do this, straight from a SharpestMinds mentee’s project: an app that lets two friends with different tastes in music compromise on what to listen to by factoring in both of their preferences. Because it’s a fun, consumer-facing app, the interface and structure of the app are as important as the model, so a more balanced time investment (and win condition) makes more sense. Besides, taste in music is extremely subjective, and it’s unlikely that a user would notice even a 5% improvement in the model’s loss.

Explaining these kinds of trade-offs to a hiring manager is a great way to show that your focus is where it needs to be: not on Kaggle-like model jiujitsu, but on delivering real value for real people, and real use cases.

As a parting note: the time to define your win condition is before you even begin to work on your project. Otherwise, you risk spiraling into a vortex of never-ending optimizations and tweaks that eat up your time without delivering real value.

Heading into a project without a clear win condition is a recipe for wasting your time and your company’s money solving problems that don’t need to be solved. Most companies are understandably worried about this risk — and one of the best ways to stand out as a job seeker is to address it head-on by showing that you can connect business problems to data science problems.

And that all starts with win conditions.

You can follow me on Twitter at @jeremiecharris

