MLflow plugins allow integration with any custom platform. This is a great opportunity to move from experiment tracking to model production. Today let’s see how to implement a deployment plugin in MLflow.

The previous articles about MLflow:
- Scale-up your models development with MLflow
- Improve your MLflow experiment, keeping track of historical metrics
What’s an MLflow plugin?
MLflow has been built as a framework-agnostic tool for Machine Learning that covers the entire ML process, from data exploration to model development, tuning and deployment. The MLflow Python API is so versatile that it allows developers to fully integrate MLflow with different ML frameworks and backends. Plugins are one of the integrations that MLflow offers. Plugins give users an additional MLflow-compatible component that could, for example, save artefacts on specific services (e.g. on a dedicated database). Moreover, plugins can be developed to accept third-party authentications or to deploy any model to custom platforms.
Here you can find the excellent MLflow plugin documentation: https://www.mlflow.org/docs/latest/plugins.html, while here you can find a list of existing MLflow deployment plugins: https://www.mlflow.org/docs/latest/plugins.html#deployment-plugins.
In this article, we’ll see how to implement a plugin that can take a model from development to production on GCP AI-platform. In particular, we’ll understand what elements are needed to create a new plugin, and what functions we need to deploy a custom model to the AI-platform. This will be a skeleton that can be further developed into a more automatic and autonomous component of our ML Ops framework.
MLflow AI-platform plugin
What do we need to create a new MLflow plugin?
In essence, an MLflow plugin is a custom Python package whose main deployment object has to satisfy the following constraints:
- it must be a subclass of mlflow.deployments.BaseDeploymentClient;
- the plugin module has to expose run_local and target_help functions. The former allows local testing – we’ll not implement this function in this tutorial – while the latter returns useful help information;
- it has to implement the following methods (inherited from BaseDeploymentClient): 1) create_deployment, the main method defining the deployment process; 2) update_deployment, for updating deployment info; 3) delete_deployment, to delete deployed models; 4) list_deployments, to list all the deployed models; 5) get_deployment, to retrieve info about a specific deployed model; 6) predict, the main method to return predictions directly from MLflow.
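To make these requirements concrete, here is a minimal sketch of what the main deployment object and the two module-level functions could look like. The class name and the exact signatures are assumptions based on the MLflow deployments API, not the final plugin code:

from mlflow.deployments import BaseDeploymentClient

class AIPlatformDeploymentClient(BaseDeploymentClient):
    # Main deployment object: subclasses BaseDeploymentClient and
    # implements the methods MLflow expects from a deployment plugin
    def create_deployment(self, name, model_uri, flavor=None, config=None):
        ...

    def update_deployment(self, name, model_uri=None, flavor=None, config=None):
        ...

    def delete_deployment(self, name):
        ...

    def list_deployments(self):
        ...

    def get_deployment(self, name):
        ...

    def predict(self, deployment_name, df):
        ...

# Module-level functions required by the plugin interface
def run_local(target, name, model_uri, flavor=None, config=None):
    raise NotImplementedError("local testing is not covered in this tutorial")

def target_help():
    return "Deploy MLflow models to GCP AI-platform"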
In particular, our target is GCP AI-platform. On the AI-platform there’s a path to follow for deploying models as endpoints: https://cloud.google.com/ai-platform/prediction/docs/deploying-models. Thus, we can structure this path so that it can easily be implemented in our deployment method create_deployment, as fig.1 shows:
- First, the code retrieves all the needed configuration settings (e.g. production_bucket, run_id, artefacts and so on).
- Then a "deployment bundle" is created. The code looks up the model artefacts location and the production uri, and it creates the installation package for our model as well as the model’s endpoint API Python code – create_bundle().
- At this point the bundle can be pushed to a specific production bucket – upload_bundle().
- The AI-platform Python API starts creating the model container on the platform – create_model().
- A json request is sent to the AI-platform to proceed with the model deployment – update_deployment().

Additionally, we’ll have to create the delete_deployment, get_deployment and list_deployments methods, again using the AI-platform API, to satisfy the MLflow Deployment plugin requirements.
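Putting the steps above together, here is a rough sketch of how create_deployment could orchestrate the helper methods; the exact signatures and return value are assumptions:

def create_deployment(self, name, model_uri, flavor=None, config=None):
    # 1) configuration settings are already available from the constructor
    # 2) build the deployment bundle: endpoint code, setup file and sdist archive
    bundle_path = self.create_bundle()
    # 3) push the bundle and the model artefacts to the production bucket
    self.upload_bundle(bundle_path)
    # 4) create (or reuse) the model container on the AI-platform
    self.create_model()
    # 5) trigger the deployment of the new model version
    self.update_deployment(name)
    return {"name": name, "flavor": flavor}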
Structure of the plugin folders and templated model files
Firstly, the plugin package files and folders’ structure can be defined as below:
mlflow_ai_plugin/
    mlflow_ai_plugin/
        __init__.py
        DeploymentTrackingInterface.py
        model_setup.py
        predictor_template.py
    requirements.txt
    setup.py
The main deployment class is contained in the entrypoint DeploymentTrackingInterface.py. The predictor_template.py file is a templated version of the model’s endpoint API.
The API satisfies the AI-platform requirements. Firstly, the class constructor __init__ reads the input model, as well as a preprocessor – for the sake of simplicity we commented out the preprocessor’s bits here. The method predict returns the model’s predictions: an input dataset is converted to a NumPy array with np.array(instances) and probabilities are computed through self._model.predict(inputs). Finally, the method from_path allows spinning up the model in the AI-platform using MLflow. Any model artefact file (e.g. pickle, joblib, h5) can be read through MLflow with mlflow.pyfunc.load_model(). Here MODEL_ARTIFACT_URI is a templated keyword, which is substituted with the model artefacts uri when the deployment script runs, through the main method create_bundle(), as we’ll see later.
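As a reference, here is a minimal sketch of what predictor_template.py could contain; the class name Predictor is an assumption, the preprocessor parts are left commented out as in the article, and MODEL_ARTIFACT_URI is the placeholder that create_bundle() replaces:

import numpy as np
import mlflow.pyfunc

class Predictor(object):
    # AI-platform custom prediction routine: constructor, predict and from_path
    def __init__(self, model):
        self._model = model
        # self._preprocessor = preprocessor  # omitted for simplicity

    def predict(self, instances, **kwargs):
        # Convert the incoming payload to a NumPy array and run the model
        inputs = np.array(instances)
        # inputs = self._preprocessor.preprocess(inputs)
        return self._model.predict(inputs).tolist()

    @classmethod
    def from_path(cls, model_dir):
        # MODEL_ARTIFACT_URI is substituted with the real artefacts uri at deployment time
        model = mlflow.pyfunc.load_model("MODEL_ARTIFACT_URI")
        return cls(model)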
For the model’s API predictor_template.py to work, a setup.py file is needed, so the AI-platform knows which packages have to be installed for the model. model_setup.py is a templated version of this model setup file. In this example the setup file is fairly simple, but it can be tuned, for example by data scientists at deployment time.
Again, in the method create_bundle the templated keyword MODEL_VERSION is substituted with the current model version. The file creates a package named deploy_from_script whose script is predictor.py – which is created from predictor_template.py – and whose dependencies are installed through install_requires.
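A minimal sketch of the templated model_setup.py, assuming mlflow and numpy as the only model dependencies (MODEL_VERSION is the placeholder replaced by create_bundle):

from setuptools import setup

setup(
    name="deploy_from_script",
    version="MODEL_VERSION",  # replaced with the current model version by create_bundle
    scripts=["predictor.py"],  # the endpoint file generated from predictor_template.py
    install_requires=["mlflow", "numpy"],  # model dependencies (assumption)
)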
Deployment Tracking Interface
DeploymentTrackingInterface.py is the core of our plugin, where all the AI-platform steps are triggered and controlled.
Config and constructor
The very first step is to define the job constant variables, the configurations and the object constructor.
Fig.4 shows the import statements as well as the constant variables. Among the imports, it is worth remembering that we need from mlflow.deployments import BaseDeploymentClient and import mlflow_ai_plugin, which is the package itself, so it will be possible to retrieve the paths of the templated files predictor_template.py and model_setup.py. Finally, we can define here the two functions required by MLflow plugins, run_local and target_help – which will not be implemented here.
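The import block and the constants could look roughly like this; apart from AI_PLATFORM, which is used throughout the article, the constant names are illustrative:

import os
from googleapiclient import discovery
from mlflow.deployments import BaseDeploymentClient
from mlflow.tracking import MlflowClient

import mlflow_ai_plugin  # the package itself, used to locate the templated files

# Paths of the templated files shipped with the package
PLUGIN_DIR = os.path.dirname(mlflow_ai_plugin.__file__)
PREDICTION_SCRIPT = os.path.join(PLUGIN_DIR, "predictor_template.py")
SETUP_FILE = os.path.join(PLUGIN_DIR, "model_setup.py")

# Client for the AI-platform (ml, v1) REST API, used by all deployment methods
AI_PLATFORM = discovery.build("ml", "v1")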
For the AI-platform and process configurations we can define a Config class that reads as input a dictionary containing the following variables:
- project_id: the GCP project we are working on
- production_bucket: the name of the bucket used for production
- production_route: the path within the production_bucket where we want to store all the model’s artefacts
- run_id: the MLflow model’s run id, so we can select a specific model
- tracking_uri: the MLflow UI uri, e.g. http://localhost:5000
- prediction_script: the path to predictor_template.py, which will be read by create_bundle() to accommodate the model’s endpoint information
- tmp_dir: a temporary path where the plugin can create files
- setup_file: the model_setup.py file path, which will be read by create_bundle()
- model_name: the name of the model in production
- version_name: the version of the model
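A minimal sketch of such a Config class, assuming the settings are passed in as a dictionary with environment variables as a fallback (the fallback mirrors the bash example at the end of the article, but is an assumption):

import os

class Config(dict):
    KEYS = [
        "project_id", "production_bucket", "production_route", "run_id",
        "tracking_uri", "prediction_script", "tmp_dir", "setup_file",
        "model_name", "version_name",
    ]

    def __init__(self, settings=None):
        # Read each expected key from the input dictionary or the environment
        super().__init__()
        settings = settings or {}
        for key in self.KEYS:
            self[key] = settings.get(key, os.environ.get(key))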
Deployment Protocol
Fig.6 shows the deployment protocol, which defines all the methods necessary to implement a plugin in MLflow as well as all the methods for deploying a model to the AI-platform. The constructor __init__ defines the configuration settings from the Config class – defined above.
Create bundle method
Fig.7 shows the code steps for creating the model "bundle", namely the model’s files.
Firstly, retrieve_model_info() is called (fig.8). Here, through the MlflowClient, the model files are retrieved from the run_id and the run is converted to a dictionary, so the artefacts location is available as run_info['info']['artifact_uri'].
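A minimal sketch of retrieve_model_info, assuming it stores the artefacts location on the instance:

from mlflow.tracking import MlflowClient

def retrieve_model_info(self):
    # Fetch the run from the tracking server and read its artefacts location
    client = MlflowClient(tracking_uri=self.settings["tracking_uri"])
    run_info = client.get_run(self.settings["run_id"]).to_dictionary()
    self.model_uri = run_info["info"]["artifact_uri"]
    return run_info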
Next, the model’s endpoint Python code is adjusted accordingly, so that MODEL_ARTIFACT_URI becomes the self.model_uri_production path (e.g. gs://bucket/model/model.joblib) and the predictor_template.py file is saved as self.predictor_path (e.g. predictor.py). This substitution is done using the Unix sed command.
Finally, the model’s setup.py is created, again substituting MODEL_VERSION via sed and saving the file locally, so the AI-platform can read it. Once the model’s files are available, create_bundle proceeds to package the installation as a gztar sdist through the command python self.setup_path sdist --formats=gztar, and it returns the gztar file path to create_deployment.
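A rough sketch of create_bundle, assuming sed and python are available on the machine running the deployment, and that self.predictor_path and self.setup_path point to the local output files:

import glob
import subprocess

def create_bundle(self):
    self.retrieve_model_info()
    # Replace the templated keywords and write the real endpoint and setup files
    subprocess.run(
        f"sed 's|MODEL_ARTIFACT_URI|{self.model_uri_production}|g' "
        f"{self.settings['prediction_script']} > {self.predictor_path}",
        shell=True, check=True)
    subprocess.run(
        f"sed 's|MODEL_VERSION|{self.settings['version_name']}|g' "
        f"{self.settings['setup_file']} > {self.setup_path}",
        shell=True, check=True)
    # Package the model endpoint as a gztar sdist the AI-platform can install
    subprocess.run(f"python {self.setup_path} sdist --formats=gztar",
                   shell=True, check=True)
    return glob.glob("dist/*.tar.gz")[0]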
Upload bundle method
At this point the installation gztar file, as well as the model artefacts (e.g. the model binary file model.joblib), are uploaded to the production path, as shown in fig.9.
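A minimal sketch of upload_bundle, assuming the google-cloud-storage client and that the uploaded archive location is kept for the later deployment request:

import os
from google.cloud import storage

def upload_bundle(self, bundle_path):
    # Copy the sdist archive to the production bucket, under production_route
    client = storage.Client(project=self.settings["project_id"])
    bucket = client.bucket(self.settings["production_bucket"])
    destination = f"{self.settings['production_route']}/{os.path.basename(bundle_path)}"
    bucket.blob(destination).upload_from_filename(bundle_path)
    # Remember the gs:// uri of the bundle for the version creation request
    self.bundle_uri = f"gs://{self.settings['production_bucket']}/{destination}"
    return self.bundle_uri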
Create model method
Next, once all the files are available to the AI-platform, we need to create an AI-platform model through create_model. Initially, a json request body is prepared, specifying the model name self.settings['model_name'], whether we want online prediction logging (onlinePredictionLogging) and the working region (e.g. europe-west1). Then all the models present on the AI-platform are listed with AI_PLATFORM.projects().models().list and a for-loop checks whether the given model name already exists. If the model does not exist, AI_PLATFORM creates the new model with AI_PLATFORM.projects().models().create().execute().
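A minimal sketch of create_model, using the AI_PLATFORM client defined earlier; the region and logging flag follow the article, while error handling is omitted:

def create_model(self):
    # Create the model container on the AI-platform if it does not exist yet
    parent = f"projects/{self.settings['project_id']}"
    body = {
        "name": self.settings["model_name"],
        "onlinePredictionLogging": True,
        "regions": ["europe-west1"],
    }
    existing = AI_PLATFORM.projects().models().list(parent=parent).execute()
    for model in existing.get("models", []):
        if model["name"].endswith("/" + self.settings["model_name"]):
            return model  # the model container already exists
    return AI_PLATFORM.projects().models().create(parent=parent, body=body).execute()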
Update deployment method
The update_deployment method is the real trigger for deployment and it is a requirement of MLflow plugins. This method calls the update_source method, as shown in fig.11.
update_source sends a json request to the AI-platform. The json request contains the name of the model version, the deploymentUri, which is the location of the model’s artefacts, createTime, machineType, packageUris, which is the location of the gztar model’s installation file, pythonVersion, runtimeVersion for the AI-platform and the prediction class (predictionClass), namely the class from the model’s endpoint file predictor.py. Through AI_PLATFORM.projects().models().versions().create the new endpoint is created.
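A rough sketch of update_source, where the machineType, runtimeVersion and pythonVersion values are illustrative and predictor.Predictor refers to the prediction class in the generated predictor.py:

def update_source(self):
    # Deploy a new version of the model: this is the actual endpoint creation
    parent = (f"projects/{self.settings['project_id']}"
              f"/models/{self.settings['model_name']}")
    body = {
        "name": self.settings["version_name"],
        "deploymentUri": self.model_uri_production,  # model artefacts location
        "packageUris": [self.bundle_uri],            # gztar installation file
        "machineType": "mls1-c1-m2",
        "runtimeVersion": "2.5",
        "pythonVersion": "3.7",
        "predictionClass": "predictor.Predictor",
    }
    return AI_PLATFORM.projects().models().versions().create(
        parent=parent, body=body).execute()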
Delete deployment method
Now, let’s have a look at the accessory plugin methods. delete_deployment deletes an existing model version, given its GCP uri (fig.12). The model’s version is restructured, to be consistent with the AI-platform, and the model’s AI-platform path is created as parent = self.settings['project_id'] + f"/models/{self.settings['model_name']}". This model version is then deleted through AI_PLATFORM.projects().models().versions().delete(name=body).
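A minimal sketch of delete_deployment; the reformatting of the version name is an assumption about what "restructured" means here:

def delete_deployment(self, name):
    # AI-platform version names cannot contain dots, hence the substitution (assumption)
    version = name.replace(".", "_")
    body = (f"projects/{self.settings['project_id']}"
            f"/models/{self.settings['model_name']}"
            f"/versions/{version}")
    return AI_PLATFORM.projects().models().versions().delete(name=body).execute()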
Get and list deployments methods
Fig.13 shows the get_deployment and list_deployments methods. The former returns a model’s info given its name, through AI_PLATFORM.projects().models().versions().get(name=body). The latter lists all the deployed models through AI_PLATFORM.projects().models().versions().list(parent=parent), where parent is the model’s uri within the AI-platform.
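Minimal sketches of the two methods, again using the AI_PLATFORM client:

def get_deployment(self, name):
    # Retrieve the info of a single deployed version
    body = (f"projects/{self.settings['project_id']}"
            f"/models/{self.settings['model_name']}"
            f"/versions/{name}")
    return AI_PLATFORM.projects().models().versions().get(name=body).execute()

def list_deployments(self):
    # List every version deployed under the model
    parent = (f"projects/{self.settings['project_id']}"
              f"/models/{self.settings['model_name']}")
    response = AI_PLATFORM.projects().models().versions().list(parent=parent).execute()
    return response.get("versions", [])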
Install MLflow Deployment plugin
Let’s now turn our attention to the MLflow Deployment plugin setup.py file. This setup installs the plugin within your MLflow installation. It is worth noticing that we need to specify the plugin entry point: entry_points={"mlflow.deployments": "aiplatform=mlflow_ai_plugin.DeploymentTrackingInterface"}, which points to the DeploymentTrackingInterface code. aiplatform is read as a target by MLflow, so that we can call our MLflow AI-platform plugin via a command like mlflow deployments create -t aiplatform ...
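For reference, a minimal sketch of the plugin’s setup.py; the package metadata and dependency list are illustrative:

from setuptools import setup, find_packages

setup(
    name="mlflow_ai_plugin",
    version="0.0.1",
    packages=find_packages(),
    install_requires=["mlflow", "google-api-python-client", "google-cloud-storage"],
    # The entry point registers "aiplatform" as a deployment target in MLflow
    entry_points={
        "mlflow.deployments": [
            "aiplatform=mlflow_ai_plugin.DeploymentTrackingInterface",
        ],
    },
)

After a pip install . (or pip install -e . while developing), MLflow discovers the target through this entry point.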
Examples
Finally, let’s see some examples of how to use the deployment plugin. First, we are going to run the deployment plugin through Python. As we saw in part 2 of our MLflow series, let’s run an MLflow experiment where we train a simple Neural Network model, as depicted in fig.15.
Once the model’s training has been saved in MLflow, we can deploy this model through a Python script like:
from mlflow.deployments import get_deploy_client
target_uri = 'aiplatform'
aiplatform = get_deploy_client(target_uri)
aiplatform.create_deployment()
target_uri='aiplatform' communicates the target to MLflow. From there we can use the mlflow.deployments methods to get our AI-platform plugin and call the deployment core aiplatform.create_deployment().
We can then finalise the deployment via a bash script that sets the input environment variables and calls the Python script above (deploy.py):
#!/bin/bash
export project_id='YOUR GCP PROJECT'
export production_bucket='YOUR PRODUCTION BUCKET'
export production_route='YOUR PATH WITHIN PROD BUCKET'
export run_id='2c6d5f4bc3bc4bf3b6df3ca80212a28d'
export tracking_uri='http://localhost:5000'
export model_name='YOUR MODEL NAME'
export version_name='VERSION NUMBER'
python deploy.py
The same result can be achieved via the command line interface. In this case, rather than calling a Python script, we’ll have:
# export statements as above
export model_uri="YOUR MODEL PRODUCTION URI"
mlflow deployments create -t aiplatform --name tester -m $model_uri
Finally, we can send requests to get predictions on our data, as shown in fig.16. Here we are going to use the googleapiclient.discovery API, creating a json request with the input data to return the model’s predictions.
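A minimal sketch of such a request, assuming a two-feature input and placeholder project, model and version names:

from googleapiclient import discovery

# Build the AI-platform client and compose the deployed version's full name
service = discovery.build("ml", "v1")
name = "projects/YOUR_GCP_PROJECT/models/YOUR_MODEL_NAME/versions/YOUR_VERSION"

# The request body carries the instances that the Predictor class will receive
response = service.projects().predict(
    name=name,
    body={"instances": [[0.1, 0.2], [0.5, 0.6]]},
).execute()

print(response.get("predictions", response))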
That’s all for the MLflow side! I hope you enjoyed these articles about MLflow, its SDK and its plugin development.
If you have any questions or curiosities, just write me an email at [email protected]