A machine learning process is made up of several steps that repeat in a cycle. The more of these steps an organisation can automate through MLOps, the more mature its machine learning process is.

MLOps applies the DevOps philosophy to machine learning systems (to read more about these two practices, please check out my article: https://towardsdatascience.com/mlops-vs-devops-5c9a4d5a60ba). Bringing a machine learning model to production involves several steps, and the level of automation of those steps determines the maturity of the machine learning process. Generally, the more automated the process, the faster new models can be trained when new data or new model implementations arrive.
Steps of a machine learning process
- Data extraction: This step integrates the data for the machine learning task from various data sources. The aim of this step is to select which data will be used in the machine learning task.
- Data analysis: In this step, an exploratory data analysis (EDA) is performed to better understand the extracted data. The aim of this step is twofold: (1) understand the data, its schema and the distributions of the values to be used as inputs or labels for the model; (2) identify any data preparation and feature engineering that will be required to carry out the machine learning task.
- Data preparation: This step involves data cleaning (handling missing data, removing nonsense data, etc.), splitting the data into training/validation/test sets, and feature engineering, where new features are created in the hope of improving the model’s predictive power. The output of this step is the cleaned training, validation and test data, with the new features added, in a format that can be passed into a model.
- Model training: Different algorithms are implemented and trained on the output of the data preparation step to produce various machine learning models. Hyperparameter tuning is typically done in this step, exploring the hyperparameter space to identify an optimal model. The output of this step is the model artifact (model architecture and model weights) of the best model found.
- Model evaluation: The best model found in the previous step is evaluated on the test data set aside in the data preparation step. Before this step, an evaluation metric or a set of evaluation metrics must be chosen to determine how the model is evaluated.
- Model validation: The chosen model must perform well enough to be deemed fit for deployment into production. Generally, this means the model must perform better than a baseline. For example, this may mean testing whether the model’s predictions are better than the current process it is trying to improve.
- Model serving: The chosen model is deployed to an appropriate environment so that applications/processes can consume the model’s predictions. For more details about the various model deployment options, please refer to my article: https://towardsdatascience.com/machine-learning-model-deployment-options-47c1f3d77626. Depending on the business requirements, the model can be served in one of the following ways:
  - Online predictions using a REST API, where applications/processes send input data to the model via the endpoint and receive the model’s predictions in response.
  - The model is embedded in an edge device and the model’s predictions are calculated on the edge.
  - A batch prediction process, where periodically or on certain events compute resources are created and the model is deployed on them to generate predictions for the input data.
- Model monitoring: The deployed model’s performance is monitored to determine whether the machine learning process needs to go through another iteration. Usually an evaluation metric threshold is set, and if the deployed model’s evaluation metric deteriorates past this threshold, it may be time to go through another iteration.
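As a rough illustration of how the data preparation, model training, model evaluation and model validation steps fit together, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the function names (`split`, `fit_ridge_1d`, `mae`, `run_iteration`), the toy one-feature model, the 60/20/20 split, the mean absolute error metric and the "always predict the training mean" baseline are hypothetical and not taken from any particular project.

```python
import random
from statistics import mean


def split(rows, seed=0, train_frac=0.6, val_frac=0.2):
    """Data preparation: shuffle, then split into training/validation/test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def fit_ridge_1d(rows, l2):
    """Model training: fit y = slope * x + intercept with an L2 penalty on the slope."""
    x_bar = mean(x for x, _ in rows)
    y_bar = mean(y for _, y in rows)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    sxx = sum((x - x_bar) ** 2 for x, _ in rows)
    slope = sxy / (sxx + l2)
    return slope, y_bar - slope * x_bar


def mae(model, rows):
    """Evaluation metric (assumed here): mean absolute error, lower is better."""
    slope, intercept = model
    return mean(abs(slope * x + intercept - y) for x, y in rows)


def run_iteration(rows, l2_grid=(0.0, 1.0, 10.0)):
    train, val, test = split(rows)
    # Hyperparameter tuning: keep the candidate with the lowest validation error.
    best = min((fit_ridge_1d(train, l2) for l2 in l2_grid),
               key=lambda model: mae(model, val))
    # Model evaluation: score the chosen model on the held-out test data.
    model_mae = mae(best, test)
    # Model validation: compare against a baseline that always predicts
    # the mean label of the training data (slope 0, constant intercept).
    baseline = (0.0, mean(y for _, y in train))
    baseline_mae = mae(baseline, test)
    return {
        "model": best,
        "model_mae": model_mae,
        "baseline_mae": baseline_mae,
        "deploy": model_mae < baseline_mae,
    }


# Toy data for illustration only: labels follow y = 2x + 1 exactly.
rows = [(x, 2 * x + 1) for x in range(50)]
result = run_iteration(rows)
print(result["deploy"], result["model_mae"], result["baseline_mae"])
```

The point of the sketch is the separation of concerns: the validation data is used only to pick hyperparameters, the test data only to score the final model, and the deployment decision only depends on the comparison with the baseline.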
Once it is determined that the machine learning process needs to go through another iteration, all the steps are repeated, most likely with adjustments or enhancements to individual steps to account for changes in the data. Because the process is repetitive and cyclical, automating these steps improves efficiency, consistency and scalability.
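The monitoring trigger that starts a new iteration can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `MonitorConfig`, `needs_new_iteration`, the averaging window and the choice of a "higher is better" metric such as accuracy are all hypothetical names and design choices, not part of any specific monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class MonitorConfig:
    metric_threshold: float  # assumed: minimum acceptable value of the metric
    window: int = 3          # assumed: number of recent readings to average


def needs_new_iteration(metric_history, config):
    """Return True when the recent average metric falls below the threshold."""
    if len(metric_history) < config.window:
        return False  # not enough readings yet to make a decision
    recent = metric_history[-config.window:]
    return sum(recent) / len(recent) < config.metric_threshold


config = MonitorConfig(metric_threshold=0.8)
healthy = needs_new_iteration([0.90, 0.91, 0.89], config)
degraded = needs_new_iteration([0.90, 0.75, 0.70, 0.72], config)
print(healthy, degraded)
```

Averaging over a small window, rather than reacting to a single reading, is one way to avoid retraining on a one-off dip; for an error metric where lower is better, the comparison would be reversed.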
Generally, the more of these steps an organisation is able to automate, the more mature its machine learning process is. The benefit of automating most or all of these steps is that an organisation can run many experiments efficiently and deploy a validated model into production faster. Removing manual processes also reduces the chance of failures caused by human error.
There are three levels of MLOps, which differ in how much of the machine learning process is automated. These levels are distinguishable by their characteristics and challenges. I will go into more detail on the characteristics and challenges of these three levels in future articles.