
Introduction
As a machine learning engineer, I’ve noticed that the exact responsibilities attached to data titles have been constantly changing over the last few years. As a result, many companies are no longer sure about the difference between the data roles, which can lead to the wrong fit for the role, particularly at startups or early-stage companies. In this article we’ll explore why, unless you have a specific data science or machine learning project in mind, it may not be correct to blanket hire data scientists early on.
So you’ve decided that you need to be data-driven, if only to unblock the multiple functions across the business asking questions which can only be answered by an analyst (or a whole team of them).
This is the point at which the business decides to hire a bunch of data scientists to solve all the business needs. After all, data scientists are the hottest data commodities on the market, able to solve problems using statistics and machine learning. They’ll be able to answer all the business questions and make the company data-driven…right?
Yes and no.
The Data Roadmap
The key issue with hiring data scientists to solve all of your data problems is that "data" encompasses a very broad range of skills, from querying the data to building pipelines and machine learning models. It is unlikely that a single data scientist has all of these skills, and if they do, they almost certainly won’t have had the opportunity to develop them as deeply as a specialist (unless of course you’ve found a mythical data science unicorn).
For most companies, unless they have an AI-centric product, the ability to drive business decisions at a fast pace by pulling numbers and answering cause-and-effect questions is much more important than spending months developing a single machine learning model. Therefore, a typical high-level data-driven roadmap at the organisational level can be broken down into 3 main stages:
- Dashboarding and KPIs (reactive decision making) – enabling business stakeholders to make decisions based on data
- Strategy recommendations (proactive decision making) – driving value by making business recommendations based on proactive analysis of data
- Machine learning and automation – automating processes and proactively making business recommendations based on model predictions
Here are the typical job titles required for each stage (as of 2021). Notice that data science/machine learning skills are only required in stage 3:
- Stage 1: Data engineer (data pipelines), Analytics engineer (transformed tables), Data analyst (dashboarding)
- Stage 2: Data analyst/Data scientist (working with stakeholders)
- Stage 3: Data scientist/Machine learning engineer (machine learning/statistical model development)
The Right Skillset

So will just hiring data scientists answer all the business questions and make the company data-driven?
To a degree. Many data scientists do have the skills to cover multiple roles: data cleaning techniques, table transformations, and the analytics know-how to handle both strategy and dashboarding are just some examples. However, they may not have as much experience orchestrating data pipelines as a data engineer, designing denormalised tables as an analytics engineer, or working with demanding stakeholders as a data analyst. It’s all about finding the right skillset for what the role requires, rather than finding someone who can cover all roles (but can’t be in 3 places at once).
That’s not to say you shouldn’t hire data scientists at all before stage 2 of a roadmap. Many of them are immensely talented and will be able to drive projects forward and deliver a lot of value. But if you hire a data scientist solely to do the job of a data engineer or a data analyst, you’ll probably find that they become dissatisfied quite quickly. They may not want to work on ad-hoc SQL queries or on improving data reliability. Anecdotally, the data scientists I know like to flex their statistical and machine learning capabilities in their work, as it’s what they enjoy most in the field.
Conclusion
The rise of data science and the constant flux in data titles over the last few years have made companies keen to hire data scientists to cover all of their data needs. Companies (particularly those at an earlier stage) will find greater value in defining which stage of the data-driven roadmap they currently sit at, then digging deeper into the different data roles to figure out what skills are needed. Don’t hire a data scientist to take on the role of a data engineer when data science and machine learning projects are two years down the line.