Opinion

Intro
According to [1], data science is one of the most sought-after jobs on the job market. But is this still the case? Or is there already a more desirable one?
There is! Machine learning engineering is overtaking data science in the job market.
In this article, I want to explain why, in my opinion, machine learning engineering is overtaking data science, and how you can start learning it.
But let’s first look at the difference between the two roles.
Machine Learning Engineer vs. Data Scientist
A quote from a Snowflake article summarizes the differences quite well [2]:
Machine learning engineers are further down the line than data scientists within the same project or company. A data scientist, quite simply, will analyze data and glean insights from the data. A machine learning engineer will focus on writing code and deploying machine learning products.
We can also take a look into the lifecycle of a data science project to understand the differences better:
![Figure 1: ML project lifecycle [3] (Image by author).](https://towardsdatascience.com/wp-content/uploads/2022/09/08fWtSzod7IdHR6t0.png)
So basically, a data scientist develops, trains, and evaluates a model. The machine learning engineer then takes that model, deploys it into production, and ensures that it is maintained. In other words, the machine learning engineer turns the trained model into a product that can generate revenue.
But aren’t both jobs equally important? Yes, they are. But companies have already hired data scientists in large numbers, because until now they were mostly in the modeling and exploration phase. Machine learning engineers are now in high demand because companies need to put these models into production to create value from them.
According to a VentureBeat article [4], "87% of data science projects never make it into production". In my opinion, one major reason is the lack of machine learning engineers who know how to put models into production. This mismatch suggests that companies are now focusing more (or at least should be) on hiring machine learning engineers who can do exactly that.
We can also see the difference in the open job postings on Glassdoor. For California in the US, there are, at the time of writing, 1,809 data scientist job postings compared to 3,345 machine learning engineer job postings. So there are almost twice as many open positions for machine learning engineers!
But why can’t data scientists simply learn how to put models into production as well? Because the data scientist is focused on the ML code, which is typically only a very small portion of the complete ML infrastructure (Figure 2), and should stay focused on that portion. It would simply be too complex to focus on both the ML code and the infrastructure for deployment, monitoring, …
It is therefore important to have both a data scientist and a machine learning engineer on your team to get the most value out of your data.
![Figure 2: ML code in comparison to the complete ML infrastructure (Image by [5]).](https://towardsdatascience.com/wp-content/uploads/2022/09/06xCXNd6BR97W7xci.jpeg)
Okay, so now we know that machine learning engineers are currently more in demand on the labor market. But what skills are required for being a machine learning engineer? What do you need to learn to become a machine learning engineer?
Path to Become a Machine Learning Engineer
In this section, I want to focus on the skills required for becoming a machine learning engineer and the tools I consider most worth learning. On top of that, I want to share links to online courses that I have taken on my journey to becoming a machine learning engineer.
DISCLAIMER: I only provide links to courses that I have participated in myself. The links I provide are not affiliate links, so I don’t get any money from sharing them. I just want to share them with you because they have really helped me on my learning journey!
Most Valuable Skills
So, according to an article from Udacity [6], these are the most valuable skills for becoming a machine learning engineer:
- Computer Science Fundamentals and Programming: data structures (stacks, queues, …), algorithms (searching, sorting, …), computability and complexity, and computer architecture (memory, cache, bandwidth, …)
- Probability and Statistics: probability, Bayes’ rule, statistical measures (median, mean, variance, …), distributions (uniform, normal, binomial, …), and analysis methods (ANOVA, hypothesis testing, …)
- Data Modeling and Evaluation: finding useful patterns (correlations, clusters, …), predicting properties of unseen data points (classification, regression, anomaly detection, …), and continuously evaluating model performance with an appropriate performance metric (accuracy, F1 score, …)
- Applying Machine Learning Algorithms and Libraries: choosing a suitable model for the underlying problem (decision tree, nearest neighbor, neural network, ensemble of multiple models, …), choosing a learning procedure to train the model (linear regression, gradient boosting, …), understanding the influence of hyperparameters, and gaining experience with different ML libraries (TensorFlow, Scikit-learn, PyTorch, …)
- Software Engineering and System Design: understanding different system components (REST APIs, databases, queries, …) and building interfaces for ML components
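To illustrate why the choice of performance metric in the Data Modeling and Evaluation skill matters, here is a minimal sketch in plain Python. The labels are made up for illustration, and the helper names (`confusion_counts`, `accuracy`, `f1_score`) are my own, not taken from any library. It shows a "model" that always predicts the majority class on an imbalanced data set: accuracy looks good, while the F1 score reveals that the positive class is never found.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    c = confusion_counts(y_true, y_pred, positive)
    denominator = 2 * c["tp"] + c["fp"] + c["fn"]
    # Convention assumed here: F1 is 0.0 when there are no positives at all.
    return 2 * c["tp"] / denominator if denominator else 0.0


# Made-up, imbalanced labels: 9 negatives and 1 positive.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A "model" that always predicts the majority class.
y_pred = [0] * 10

print(accuracy(y_true, y_pred))  # 0.9
print(f1_score(y_true, y_pred))  # 0.0
```

In practice you would probably use the metric functions of a library such as Scikit-learn rather than writing them yourself. The point of the sketch is only that a single metric can hide a useless model.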
Tools to Learn
Now let’s move on to the tools that I think are essential to learn:
- Python: I think this one is clear. Python is still the number one programming language in the field of machine learning [7], and it is also easy to learn.
- Linux: As a machine learning engineer will work a lot with infrastructure topics, being able to work on Linux is really important.
- Cloud: More and more applications are moving to the cloud. That means that you, as a machine learning engineer, will probably also deploy models to a cloud environment. Therefore, I recommend learning to work with at least one of the popular cloud providers (GCP, Azure, AWS). I am currently enrolled in the AWS developer certificate course on Udemy, which I can really recommend!
- Docker, Kubernetes: In my opinion, these two tools are a must-learn for every machine learning engineer! They are powerful tools for deploying models into production and for building complete architectures for your applications. I took the Docker and Kubernetes complete guide on Udemy and learned a lot throughout this course!
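To make the deployment side more concrete, here is a minimal sketch of the kind of interface a machine learning engineer might build around a trained model: a small HTTP service that returns predictions as JSON. Everything in it is an assumption for illustration. The "model" is a hard-coded linear function standing in for a real trained model, and the names (`WEIGHTS`, `BIAS`, `predict`, `PredictHandler`, `make_server`) and the `/predict` route are hypothetical. It uses only Python’s standard library so that it runs without extra installs. A real project would more likely use a web framework and run the service inside a Docker container.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "trained model": fixed weights of a linear model.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Return the linear model's output for one feature vector."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features")
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST /predict with a JSON prediction."""

    def do_POST(self):
        if self.path != "/predict":
            self._send(404, {"error": "not found"})
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length))
            result = predict(payload["features"])
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON, missing key, or wrong feature shape/type.
            self._send(400, {"error": str(exc)})
            return
        self._send(200, {"prediction": result})

    def _send(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(port=8000):
    """Create (but do not start) the HTTP server on localhost."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling `make_server().serve_forever()` starts the service. A POST request to `/predict` with the body `{"features": [2.0, 4.0]}` should then return `{"prediction": 1.0}`. Keeping `predict` separate from the HTTP handler is a deliberate choice in this sketch: the model logic can be tested without starting a server.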
Other Useful Online Courses
So now that you know what skills are required and what tools to learn, I also want to show you some other helpful online courses that I think can help you on your journey to becoming a machine learning engineer (at least they helped me):
- Deep Learning Specialization by Andrew Ng: This course focuses on deep learning and how to train models for image classification and many other tasks. Andrew is great at explaining the theory. You also apply the theory directly in hands-on lessons, which is great for building the skills needed to apply machine learning algorithms and libraries.
- Machine Learning Nanodegree by Udacity: This so-called Nanodegree from Udacity focuses on training ML models and putting them into production, mainly using AWS SageMaker. You can also check out my Medium article about the capstone project that I did to pass this course. NOTE: Udacity has replaced my course with a newer version, but I think the new version is still well worth taking.
- IBM Machine Learning Professional Certificate: This course on Coursera covers every aspect of machine learning, with a lot of hands-on work. You will learn about supervised and unsupervised machine learning, deep learning, reinforcement learning, and more. At the end of each course, you build your own capstone project and write a report describing your application.
Conclusion
You have now seen why I think becoming a machine learning engineer is more desirable than becoming a data scientist. You also now know the skills and tools you need to learn to become a machine learning engineer.
Therefore: Go and get your hands dirty! Learn these tools, take some online courses, and land your first machine learning engineering job.
Just one more thing that I want to say: Always get your hands dirty! Build as many hands-on ML projects as you can. And don’t forget to take your trained models and put them into production, since you want to become a machine learning engineer.
You can also read my two articles about a deep learning project in which I trained an ML model and put it into production.
In the first article, I explain the underlying problem and how I trained the ML model. I then package the trained model into a Docker container and create a simple webpage using Flask.
In the second article, I deploy the Flask application to AWS so that everyone can access it.
Thank you for reading my article to the end! I hope you enjoyed this article. If you want to read more articles like this in the future, follow me to stay updated.
References
[1] Thomas H. Davenport and DJ Patil, Data Scientist: The Sexiest Job of the 21st Century (2012), Harvard Business Review
[2] Snowflake, Machine Learning Engineer vs. Data Scientist
[3] Sundeep Teki, ML Engineer vs. Data Scientist (2022), Neptune AI Blog
[4] VB Staff, Why do 87% of data science projects never make it into production? (2019), VentureBeat
[5] Rashid Kazmi, Machine Learning in Production (MLOps) (2022), Towards Data Science
[6] Arpan Chakraborty, 5 Skills You Need to Become a Machine Learning Engineer (2016), Udacity
[7] Sakshi Gupta, What Is the Best Language for Machine Learning? (2021), Springboard