The world’s leading publication for data science, AI, and ML professionals.

Questions you should ask before you deploy your model

An Introduction to Machine Learning Deployment Strategies

Photo by Tayfun van Zantvoort on Unsplash

Finishing a machine learning model that is working well is a great effort, but that’s usually not where the process ends. For a real-world application with an impact, a trained model needs to be deployed and put into production.

Question checklist

Imagine you are an aspiring Data Scientist working for a video streaming website. You are asked to develop a recommender system that shows users videos they might enjoy based on their previous choices. You sit down, do your work and come up with this amazing model. You might think that now it is time to lay back and watch the model do the work. But let us take a step back and ask ourselves the following questions:

  • Are there any negative consequences if my model doesn’t reach the required accuracies in production?
  • The model performs well on the test set, but how do I make sure that the model works well in production without potentially losing customers if it doesn’t?
  • Is there already an existing model in production that will be replaced?
  • Is downtime an issue? How do I guarantee that there is no/little downtime during the deployment of the new model?
  • If there is a bug in the new model, do I have a strategy to roll back an older version?

There is more to think about than you initially thought, right? But what strategies are there to deploy a model?

Photo by Rob Schreckhise on Unsplash

Strategies to deploy your machine learning model

Shadow Deployment

Image by Author (Icons made by Freepik from www.flaticon.com)

If you are not sure how well your model will perform in production, one way of testing it is shadow deployment. In a shadow deployment, the new model has no real-world impact on your users; it simply runs in parallel with your current system. The new model makes its predictions, but instead of delivering them to the user it writes them, for example, to a database or a log file. The purpose of shadow deployment is to collect and monitor relevant metrics of your model in production without risking any negative side effects due to faulty predictions of your new model.
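The request-mirroring idea can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: the model objects, the `handle_request` function, and the log file name are all hypothetical, assuming only that each model exposes a `predict` method.

```python
import logging

# Shadow predictions go to a log file for later analysis, never to the user.
logging.basicConfig(filename="shadow_predictions.log", level=logging.INFO)

def handle_request(request, live_model, shadow_model):
    """Serve the live model's prediction; only log the shadow model's."""
    prediction = live_model.predict(request)
    try:
        # The shadow model sees the exact same request, but its output
        # is merely recorded so its metrics can be monitored offline.
        shadow_prediction = shadow_model.predict(request)
        logging.info("request=%s live=%s shadow=%s",
                     request, prediction, shadow_prediction)
    except Exception:
        # A bug in the shadow model must never affect the user.
        logging.exception("shadow model failed on request=%s", request)
    return prediction
```

Note how the `try`/`except` captures the core promise of shadow deployment: even if the new model crashes outright, the user still receives the old model's answer.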

Pros:

  • Risk minimization: No negative impact on users due to (unexpected) low model performance.
  • No impact on production e.g. due to bugs in the new model since requests are being mirrored.

Cons:

  • The operational overhead increases since two systems need to be run and monitored in parallel.
  • Slower rollout

Canary Deployment

Image by Author (Icons made by Freepik from www.flaticon.com)

Canary deployment is another strategy that tries to reduce the risk of new deployments. It can also be seen as the next logical step after a shadow deployment. Instead of rolling out the model to the entire user base, only a certain percentage of users is exposed to the new model. A typical starting split is 90/10: 90% of user requests are handled by the old model and 10% by the new one. If the new model contains a bug or its predictions are unsatisfactory, only a small subgroup of users is affected rather than all of them – this minimizes risk.

The idea is again to collect key metrics of the new model's performance over time. If the model turns out to run robustly, the share of users who are served by the new model can be increased step by step.
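A traffic split like this is often implemented with deterministic bucketing, so that the same user always hits the same model while the canary runs. The sketch below is a hypothetical illustration (the function name, the use of `crc32`, and the 10% default are assumptions, not from the article):

```python
import zlib

def route_request(user_id, request, old_model, new_model, canary_share=0.10):
    """Route ~`canary_share` of users to the new model, the rest to the old."""
    # Hash the user id into one of 100 buckets; buckets below the
    # threshold are canary traffic. Hashing (rather than random choice)
    # makes the assignment sticky per user.
    bucket = zlib.crc32(str(user_id).encode()) % 100
    if bucket < canary_share * 100:
        return new_model.predict(request)   # canary traffic (e.g. 10%)
    return old_model.predict(request)       # majority traffic (e.g. 90%)
```

Increasing the rollout step by step then amounts to raising `canary_share` from 0.10 towards 1.0 as the collected metrics build confidence in the new model.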

Pros:

  • Risk reduction: Reduced negative impact on users due to (unexpected) low model performance.
  • Lower impact on production e.g. due to bugs in the new model, since only a small share of the requests is routed to it.
  • Fast rollback: If the new model fails unexpectedly, you can roll back quickly to the old model, by simply redirecting all requests.
  • No downtime: Both models run in parallel

Cons:

  • Even higher operational overhead than with shadow deployment since two models are effectively running in production.
  • Slower rollout

Blue/Green Deployment

Image by Author (Icons made by Freepik from www.flaticon.com)

The structure of a blue/green deployment is the same as with canary deployment (two models run in parallel). The main difference is that not just a share of the requests is processed by the new model: once the new service is up and running, 100% of all requests are routed to it. The idea is to have two environments that are as identical as possible: if something goes wrong with the new model, you can simply reroute all requests back to the old model (rollback). This deployment method is often preferred when the deployment should happen faster. This of course means that the model is not tested as thoroughly as with the other deployment strategies, and fewer performance metrics are therefore available for debugging and a potential reiteration of the new model.
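The cut-over and rollback logic boils down to flipping a single pointer between two identical environments. A minimal sketch, assuming hypothetical model objects with a `predict` method (the class and attribute names are illustrative):

```python
class BlueGreenRouter:
    """Route 100% of traffic to the active environment; rollback = flip."""

    def __init__(self, blue_model, green_model):
        self.models = {"blue": blue_model, "green": green_model}
        self.active = "blue"  # the environment currently serving production

    def predict(self, request):
        # All requests go to whichever environment is active.
        return self.models[self.active].predict(request)

    def switch(self):
        # Cutting over to the new environment and rolling back are the
        # same operation; both environments keep running, so there is
        # no downtime during the flip.
        self.active = "green" if self.active == "blue" else "blue"
```

Because `switch()` only changes a pointer, a rollback after a failed release is as fast as the original cut-over.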

Pros:

  • Risk reduction: Reduced negative impact on users due to (unexpected) low model performance
  • Fast rollback: If the new model fails unexpectedly, you can roll back quickly to the old model, by simply redirecting all requests.
  • No downtime: Both models/environments run in parallel
  • Fast rollout

Cons:

  • Even higher operational overhead than with shadow deployment since two models are effectively running in production.
  • Fewer metrics and information are available for debugging the new model.
