Deploying Machine Learning Models as Data, not Code — A better match? MLOps using omega|ml

Patrick Senti
Towards Data Science
7 min read · Dec 15, 2019


The data science community is on a mission to find the optimal approach to deploying machine learning solutions. My open source framework for MLOps, omega|ml, implements a novel approach: deploying models as data, increasing speed and flexibility while reducing the complexity of the tool chain.

Photo by Stephen Dawson on Unsplash

DevOps for machine learning: Not a full match

When it comes to deploying machine learning models, tools such as mlflow and kubeflow promote the use of DevOps principles, that is, packaging models as if they were code and deploying them as part of a software release. At first sight, this approach provides several desirable properties:

  • models are versioned, along with other code
  • model releases are built, packaged and deployed as a unit
  • deployments are reproducible, fallbacks are possible

However, when we apply this DevOps approach to machine learning, our deployments fall short on other key properties that we need to run effective machine learning solutions.

In particular, in any collaborative environment, and even more so in production ML systems, we want to:

  • share and deploy new models immediately
  • re-train models automatically, in production
  • capture run-time model inputs for later use in quality assurance
  • run multiple model versions (rendezvous architecture)

Key observation: Models are essentially data

A machine learning model essentially consists of an algorithm + weights + hyperparameters. Weights and hyperparameters are data, not code, while the algorithm of any particular model is a reference to a static library. From this perspective, treating models as data is a more natural fit than building packaging semantics around weights + hyperparameters.
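To make this concrete, here is a minimal sketch (plain scikit-learn, no omega|ml involved) showing that a fitted model's learned state is just arrays plus a dict of hyperparameters:

import pickle
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)
print(type(clf.coef_))         # numpy.ndarray: the learned weights are data
print(clf.get_params())        # the hyperparameters are a plain Python dict
print(len(pickle.dumps(clf)))  # the entire model serializes to a handful of bytes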

One approach is to build additional tooling and infrastructure to obtain these properties on top of a traditional DevOps CI/CD process. However, I propose to rethink the problem and treat models as data, not code.

With omega|ml, deploying models as data is straightforward

What does it mean to treat models as data, not code? In omega|ml a model can be deployed directly from within Python code:

import omegaml as om
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression() # scikit learn model
om.models.put(clf, 'mymodel') # store the model

This stores the model clf as a joblib-pickled file and creates a Metadata entry in omega|ml’s built-in analytics store. It also creates a REST API endpoint at /api/v1/model/mymodel so that the model can be accessed from any application, in any programming language. Deployment is instant: there is no packaging or CI process, no explicit deployment step, and no containers need to be built, deployed and started.
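For illustration, a client might call this endpoint as follows. This is a sketch using Python’s requests library; the host, port and payload shape are assumptions here, not omega|ml’s documented contract:

import requests

# hypothetical client call; see the omega|ml REST API docs for the exact payload
resp = requests.put(
    'http://localhost:5000/api/v1/model/mymodel/predict',
    json={'columns': ['x1', 'x2'], 'data': [{'x1': 1.0, 'x2': 2.0}]},
)
print(resp.json())  # the prediction, served without building any container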

Fit models locally or remotely in-cluster

“Wait”, you say, “we have not fitted the model yet, so how is deploying the model useful?” omega|ml can deploy both unfitted and fitted models. In the case of an unfitted model, we can train it remotely, that is, using omega|ml’s compute cluster:

om.runtime.model('mymodel').fit(X, Y)

This will select a compute node, load the X and Y data into memory and run the clf.fit(X, Y) method. Upon completion it will store the now fitted model under the same name, effectively replacing the previously unfitted model with the fitted version. We can re-run the fit method in-cluster any time we want. Any other model methods can also be run, such as partial_fit, score, gridsearch etc.
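The same remote semantics apply to those other methods, for example:

# incremental training and remote evaluation, same pattern as fit
om.runtime.model('mymodel').partial_fit(X, Y)
om.runtime.model('mymodel').score(X, Y)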

The X, Y parameters passed to the fit method are either in-memory data objects (e.g. Pandas DataFrames, Series or Numpy ndarrays), or named objects that have previously been saved, for example:

# save data, a pd.DataFrame
om.datasets.put(data, 'mydata')
# assuming data has several feature columns and a target column Y
om.runtime.model('mymodel').fit('mydata[^Y]', 'mydata[Y]')
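Stored datasets can be retrieved just as easily, which makes for a simple round trip between machines and environments:

# retrieve the stored DataFrame later, e.g. on another machine
data = om.datasets.get('mydata')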

Separating development and production

Separating development and production is important in any software system that undergoes active development while being used productively. The same is true for machine learning systems. omega|ml supports environment separation through the concept of object promotion:

# om_prod is the production instance
om.models.promote('mymodel', om_prod.models)

The production instance can be either a separate deployment of omega|ml in a production cluster, or it can leverage the built-in bucket functionality:

# here the production-bucket is a logical namespace 
om_prod = om['production-bucket']
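Putting the two together, promoting a model into the production bucket and verifying the result might look like this (om.models.get is shown later in this article):

# promote the development model into the production namespace
om_prod = om['production-bucket']
om.models.promote('mymodel', om_prod.models)
# the production instance now serves its own copy of the model
clf_prod = om_prod.models.get('mymodel')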

How does it work?

Any instance of omega|ml provides and runs all the components a data science team needs, be it in development or in production:

  • Multi-user Jupyter Notebooks: Data scientists can work on a shared Jupyter host, or on user-specific instances. Notebooks can be scheduled with a single command, using human-like schedules, e.g. “daily at 6 AM”.
  • Analytics Store: The built-in analytics store provides structured and unstructured storage (backed by MongoDB), including a scalable distributed file system. Any storage backend can be added through omega|ml’s plugin mechanism. The built-in analytics store means teams can get to work instantly and productively, without time-consuming and error-prone copying of files to and from laptops or externally hosted object storage. It also means companies can organize the data used for machine learning centrally and securely, reducing the risk of a data breach due to inadvertent copying to potentially unsafe locations (such as a laptop’s hard disk).
  • Compute Cluster: While today’s desktop and laptop computers have ample resources even for machine learning, leveraging the cloud is useful in many cases, if not a necessity. For example, dataset processing and model training can run asynchronously, overnight, or on GPUs only available from the cloud provider, without the need to stay connected or keep the local workstation running. omega|ml is made for the cloud and comes with a powerful and extensible runtime architecture built in. It supports training models and processing large, out-of-core datasets efficiently; attaching GPUs or other resources is a matter of configuration (see the sketch after this list for the runtime’s asynchronous semantics).
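
To illustrate the runtime’s asynchronous semantics: runtime calls return a result handle rather than blocking. This sketch assumes the Celery-style result interface the runtime builds on; the exact handle API is an assumption here, so check the omega|ml docs:

# submit training to the cluster; the call returns a result handle
result = om.runtime.model('mymodel').fit('mydata[^Y]', 'mydata[Y]')
# do other work, then block for the outcome only when needed
result.get()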

In addition, omega|ml provides several APIs to match different requirements:

  • a REST API to models, datasets and lambda-style/serverless scripts (to deploy and run arbitrary Python modules)
  • an extensible Python API to build backend applications and web applications leveraging e.g. Flask, Django or Plotly Dash.
  • a command line client (cli) to support any development and scripting environment. The cli also enables working with local IDEs such as PyCharm or a local instance of Jupyter Notebook, and deploying notebooks and automatically built pip packages to the omega|ml runtime.

Note that deployment works the same way for any model that omega|ml supports, e.g. any scikit-learn model (including Pipelines), Tensorflow and Keras models. Further, omega|ml provides an easy-to-extend plugin mechanism to support any other machine learning framework in exactly the same way.
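For instance, a Keras model deploys with the exact same one-liner. A minimal sketch, assuming tensorflow is installed (the model and its name are illustrative):

from tensorflow import keras

# a tiny Keras model, stored just like the scikit-learn model above
model = keras.Sequential([keras.layers.Dense(1, activation='sigmoid', input_shape=(4,))])
model.compile(optimizer='adam', loss='binary_crossentropy')
om.models.put(model, 'mykerasmodel')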

One approach, multiple benefits

Deploying models as data means a data science team can achieve several things at once:

  • collaboration: once the model has been saved to the data store, other team members can easily retrieve the model and work with it locally or remotely:
# retrieve the model locally 
clf = om.models.get('mymodel')
# => clf is the last saved LogisticRegression
# score the model remotely, using some new in-memory data in X, Y
om.runtime.model('mymodel').score(X, Y)
# run a gridsearch remotely, saving the best model
om.runtime.model('mymodel').gridsearch(X, Y, parameters=...)
  • in-production re-training: since the model is stored as an object in a database, retraining and deploying in production is as simple as running a job that re-runs the fit method on the model:
# assuming the new data is in saved objects newX, newY
om.runtime.model('mymodel').fit('newX', 'newY')
  • capture data used for prediction: any data passed to the model through the REST API is automatically saved before prediction (this feature has room for improvement; a respective plugin is in the works). Alternatively, we can store the data after prediction, either through the /datasets REST API or via the Python API:
# store the input and output data after prediction from Python
om.datasets.put(inout, 'predicted-upondata')
# use the REST API
PUT /api/v1/datasets/predicted-upondata/ (+ JSON body)
  • run multiple model versions: considering models as data, it is easy to deploy multiple model versions using a naming convention, run the models in parallel, and choose the best result by some aggregation:
# store
om.models.put(model1, 'mymodel-1')
om.models.put(model2, 'mymodel-2')
# run in parallel, note each call is asynchronous
y1 = om.runtime.model('mymodel-1').predict(X)
y2 = om.runtime.model('mymodel-2').predict(X)
# aggregate y1 or y2 in some way once results are available
...

Note that this is just one way omega|ml supports multiple model versions; other ways include virtual object handlers (to combine multiple models under the same name) or a script/lambda module (to run arbitrary code using om.runtime.job('script-name').run()).

Conclusion

Deploying models using DevOps principles is an obvious choice when considering models as code. However, it falls short on several desired properties: instant deployment, collaboration, in-production retraining, runtime data collection for quality assurance, and running multiple model versions in a rendezvous style. Therefore, applying a DevOps approach to machine learning, and in fact to data science at large, calls for additional investment and tooling to achieve these properties.

omega|ml considers models as data, not code. This achieves all the desired properties for developing and operating data science and machine learning systems, from lab to production. Namely, omega|ml helps a data science team leverage cloud resources efficiently, collaborate on machine learning models as well as data, and operationalize ML solutions easily. Indeed, with omega|ml a single line of code is all it takes to deploy a machine learning model.

Learn more at http://get.omegaml.io
