Productizing Machine Learning Models

How to build end-to-end machine learning-powered applications

Tuhin Srivastava
Towards Data Science


Building end-to-end applications that utilize machine learning requires an immense amount of engineering work — grappling with Flask apps to serve the model; setting up complex infrastructure (such as Docker and Kubernetes) to scale properly; a completely separate stream of product-engineering work to integrate into existing systems; and front-end development to build new interfaces.

Most machine learning practitioners either have to learn these skills or rely on their engineering partners for help. As a result, going zero-to-one on solving real problems with machine learning is prohibitively expensive, and the tooling available to address these problems is non-existent or hacky at best.

Productizing machine learning is hard

The steps to productizing an existing model | Image credit: BaseTen

Typically, there are three distinct but interconnected steps towards productizing an existing model:

  1. Serving the models
  2. Writing the application’s business logic and serving it behind an API
  3. Building the user interface that interacts with the above APIs

Today, the first two steps require a combination of DevOps and back-end engineering skills (e.g. “Dockerizing” code, running a Kubernetes cluster if needed, standing up web services…). The last step—building out an interface with which end users can actually interact—requires front-end engineering skills. The range of skills necessary means that feedback loops are almost impossible to establish and that it takes too much time to get machine learning into usable products.
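To make that concrete, below is a minimal sketch of the kind of hand-rolled Flask service the first two steps typically involve. The model file, input format, and route are placeholders for illustration, not a prescribed setup.

```python
# Minimal hand-rolled model server of the kind steps 1-2 normally require.
# The model path and input format are placeholders for illustration only.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the trained model once at startup (hypothetical pickled scikit-learn model).
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Business logic and validation live here, mixed in with serving code.
    features = request.get_json()["features"]
    prediction = model.predict([features]).tolist()
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    # In production this still needs a WSGI server, a Docker image, and a cluster to run on.
    app.run(host="0.0.0.0", port=8080)
```

And this is before any front-end work has started.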

Streamlining the process

Our team experienced this pain first-hand as data scientists and engineers, so we built BaseTen.

BaseTen is an application builder that allows users to deploy machine learning models, serve APIs, and build front-end UI without having to worry about infrastructure, deployment, or learning React.

In this post, we’ll walk through the key building blocks of BaseTen with a common use case:

As a data scientist, I often want to get my model off my laptop and into the hands of stakeholders or friends so they can play with it using dynamic inputs. This matters because it's difficult to gauge how a model behaves without first-hand experience of querying it with different inputs and seeing how the outputs change.

We’ll be using the well-known neural style transfer model for the purpose of this demo, and we’ll end up with a usable application that can be shared with an external audience.

1. Deploying a model

As highlighted above, the first step is to deploy a model. BaseTen supports deploying scikit-learn, TensorFlow, and PyTorch models by calling baseten.deploy from the Python client (installed using pip). If the model is more bespoke, a custom model can be uploaded; it just needs to implement load and predict methods (here's an example). Alternatively, pre-trained models can be deployed from the model zoo.
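As a rough illustration of the custom-model path, the class below exposes the load and predict methods mentioned above. The class name, artifact path, and input format are hypothetical; the exact interface BaseTen expects is described in its docs.

```python
# Illustrative custom model exposing the load/predict interface described above.
# The class name, artifact path, and input/output shapes are assumptions.
import pickle


class CustomModel:
    def __init__(self):
        self._model = None

    def load(self):
        # Load the trained artifact into memory once, before any predictions.
        with open("model.pkl", "rb") as f:
            self._model = pickle.load(f)

    def predict(self, inputs):
        # inputs: a list of feature rows; return plain Python types for JSON serialization.
        return self._model.predict(inputs).tolist()
```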

In the example below, we deploy a deep learning model from TensorFlow Hub that composes one image in the style of another image.

The BaseTen client enables one-line model deployment. | Image credit: BaseTen
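In code, the deployment looks roughly like the sketch below. The TensorFlow Hub handle is the public arbitrary-image-stylization model; the exact arguments to baseten.deploy are an assumption here, so check the client documentation for the real signature.

```python
# Sketch of one-line model deployment with the BaseTen client.
# baseten.deploy exists per the post; its argument names here are assumptions.
import baseten
import tensorflow_hub as hub

# Public style-transfer model from TensorFlow Hub.
style_transfer = hub.load(
    "https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2"
)

# Deploy the loaded model under a human-readable name (illustrative keyword argument).
baseten.deploy(style_transfer, model_name="style-transfer")
```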

Once the model is deployed, go to the model management page to see editable metadata about the model, its deployment status, versions, and how to call it from other services.

2. Wrapping the model with pre-processing and post-processing code and serving it behind an API

Oftentimes, serving a model requires more than just calling it as an API. For instance, there may be pre- and/or post-processing steps, or business logic may need to run after the model is called.

To support this, users can write Python code in BaseTen and it will be wrapped in an API and served, with no need to worry about Kubernetes, Docker, or Flask. The Python environment can be fully customized (e.g. installing pip packages), and a Postgres data store is included.

All of the above is wrapped in a worklet, depicted in BaseTen as an inference graph made up of different types of nodes. Worklets allow users to think in terms of an application’s business logic and overall workflow. After a worklet is created, it can be invoked through an auto-generated API endpoint, a cron job, or by connecting it to a streaming data source.
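Once a worklet is exposed through its auto-generated endpoint, invoking it is an ordinary HTTP call. The sketch below is illustrative only: the URL pattern, header, and payload shape are placeholders rather than BaseTen's actual API format.

```python
# Rough sketch of invoking a deployed worklet over HTTP.
# The URL pattern, auth header, and payload keys are placeholders.
import requests

WORKLET_URL = "https://<your-baseten-host>/worklets/<worklet-id>/invoke"  # placeholder

response = requests.post(
    WORKLET_URL,
    headers={"Authorization": "Api-Key <your-api-key>"},  # placeholder auth scheme
    json={
        "style_image_url": "https://example.com/starry-night.jpg",
        "content_image_url": "https://example.com/golden-gate.jpg",
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
```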

Serve models with rich APIs using “worklets” | Image credit: BaseTen

In this first worklet example, we’ve added pre-processing code to extract model inputs from a request, called the model, and then transformed the model output into a format that we expect. In this case, the request contains sources for both the style to be transferred and the image that will be modified by the model. The post-processing step simply adds some metadata to the resulting image from the model.
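The pre- and post-processing nodes in that worklet are ordinary Python functions. A minimal sketch of what they might look like for this style-transfer flow is below; the function names and request/response shapes are illustrative, not an interface BaseTen requires.

```python
# Illustrative pre/post-processing for the style-transfer worklet.
# Function names and payload shapes are assumptions for this sketch.
import numpy as np
import requests
import tensorflow as tf


def preprocess(request):
    """Fetch both images from the request URLs and convert them to model inputs."""
    def load_image(url, target_size=(256, 256)):
        image_bytes = requests.get(url, timeout=10).content
        image = tf.io.decode_image(image_bytes, channels=3, dtype=tf.float32)
        image = tf.image.resize(image, target_size)
        return image[tf.newaxis, ...]  # add a batch dimension

    return {
        "content_image": load_image(request["content_image_url"]),
        "style_image": load_image(request["style_image_url"]),
    }


def postprocess(stylized_image, request):
    """Convert the model output to uint8 pixels and attach simple metadata."""
    pixels = np.squeeze(stylized_image.numpy())
    pixels = (np.clip(pixels, 0.0, 1.0) * 255).astype(np.uint8)
    return {
        "image": pixels.tolist(),
        "metadata": {
            "content_image_url": request["content_image_url"],
            "style_image_url": request["style_image_url"],
        },
    }
```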

3. Building a UI (when you’re not a front-end engineer)

While training machine learning models ourselves, we've often wanted to build stateful user interfaces that either use our models directly or operate on their outputs. In BaseTen, these are called views.

Using the drag-and-drop UI in BaseTen, it’s easy to add and arrange components to build a view, including text inputs, file pickers, forms, tables, PDF viewers, and data visualizations. BaseTen also includes a number of utilities to easily collect and display data, react to user inputs, and call internal and external APIs. Views can link to one another to build multi-page applications. Unlike notebook-based app builders, user state is a first-class concept. This means BaseTen can power complex human-in-the-loop applications. Lastly, views can be publicly shared or embedded and support role-based access controls if needed.

In the simple example below, we've built a view to interact with the model and API deployed above. End users can provide URLs for the style and content images, see a preview, click a button, and see the resulting image, which combines the style of the first image with the content of the second.

Running the style transfer model in a BaseTen view | Image credit: BaseTen

While this use case is simplistic, in less than five minutes we were able to deploy a live API for the style transfer model and create a user interface for testing the model. Using the same building blocks, early users have built interactive tools for users to give feedback on model predictions, data labeling apps with learning built-in, and human-in-the-loop decisioning flows.

Start building with your own models

Data scientists and machine learning engineers should have the leverage to express the power of their models in the form of real products while avoiding deployment issues and having to hack together APIs and interfaces.

We’re excited to open up free access to BaseTen for early users to kick the tires and help shape our product roadmap. All we’d like from you is your feedback. We can’t wait to see what you’ll build.


Tuhin is co-founder of Baseten, the ML Application Builder. Tuhin was a Data Scientist at Gumroad and co-founder of Shape (acquired).