Publishing the model to get predictions on new data from a REST API

Getting predictions on new data from a REST API, with the model hosted using TensorFlow Serving.

Aizaz Ali
Towards Data Science


So, you have a beautiful model that works like a charm on your data. Now you want to put that model into production and get predictions on new data.

Let me introduce you to TensorFlow Serving, a system designed to serve trained models in production. By default, it integrates seamlessly with TensorFlow models, but its capabilities can be extended to other models as well.

Saving the model

So, let’s say that you have a trained model after following the steps in this article and you are using Google Colab.

For TensorFlow Serving, the model has to be saved in the SavedModel format, with a numbered version directory containing the graph and its variables, which you can do with the code below:
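Here is a minimal sketch of the export step, assuming a trained TF 2.x Keras model held in a variable named `model` (the variable name and the temporary directory are placeholders for your own setup):

```python
import os
import tempfile
import tensorflow as tf

# Base directory for all versions of this model (placeholder path for illustration).
MODEL_DIR = tempfile.mkdtemp()

# TF Serving expects each version in its own numbered subdirectory,
# e.g. <MODEL_DIR>/1 containing saved_model.pb and a variables/ folder.
version = 1
export_path = os.path.join(MODEL_DIR, str(version))

# Export the trained Keras model in the SavedModel format.
tf.keras.models.save_model(model, export_path)

print(f"Model exported to: {export_path}")
```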

Examining the model we just saved

This is helpful if there are several models in TF Serving and we are unsure of their specifics. In that case, the piece of code below becomes invaluable: it reveals the contents of the saved model, that is, its signatures and the names, dtypes, and shapes of its inputs and outputs.
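A minimal sketch, assuming the `export_path` from the previous step: the `saved_model_cli` tool ships with TensorFlow and prints everything it knows about a SavedModel.

```python
import subprocess

# "show --all" lists the MetaGraphs, the signatures (e.g. serving_default),
# and the input/output tensor names, dtypes, and shapes of the SavedModel.
result = subprocess.run(
    ["saved_model_cli", "show", "--dir", export_path, "--all"],
    capture_output=True,
    text=True,
)
print(result.stdout)
```

The serving_default signature shown in this output is the one the REST API uses unless you ask for a different signature explicitly.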

Serving the model and getting the predictions

Serving allows us to manage model versions and choose which version to use. This is especially useful when retraining the model or doing transfer learning, without compromising the integrity of the platform's architecture.
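Before we can call the REST API, a model server has to be running. The sketch below launches tensorflow_model_server as a background process, assuming the binary is installed (for example from Google's tensorflow-serving apt repository, or by running the tensorflow/serving Docker image instead) and using my_model as a placeholder model name:

```python
import subprocess

# The server watches MODEL_DIR for numbered version subdirectories (1, 2, ...)
# and, by default, serves the highest version it finds; new versions dropped
# into the directory are picked up automatically.
server = subprocess.Popen(
    [
        "tensorflow_model_server",
        "--rest_api_port=8501",           # port for the REST endpoint
        "--model_name=my_model",          # name used in the request URL
        f"--model_base_path={MODEL_DIR}",
    ],
    stdout=open("server.log", "w"),
    stderr=subprocess.STDOUT,
)
```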

Now let’s use TensorFlow Serving’s REST API to make predictions:
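A minimal sketch of the request, assuming the server started above and a NumPy array `x_test` of preprocessed examples with the same shape the model was trained on (both are placeholders for your own setup):

```python
import json
import requests

# Build the JSON payload: "instances" is a list of input examples.
payload = json.dumps({
    "signature_name": "serving_default",
    "instances": x_test[0:3].tolist(),
})

# POST to the predict endpoint of the model we named "my_model" above.
response = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    data=payload,
    headers={"content-type": "application/json"},
)

predictions = json.loads(response.text)["predictions"]
print(predictions)
```

If you want a particular version rather than the latest one, the URL can target it directly, e.g. http://localhost:8501/v1/models/my_model/versions/1:predict.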

This concludes the topic, covering the end-to-end process of building the model, as described in the previously mentioned article, and then hosting it to get predictions on new data through a REST API.

There are various ways to use Serving, and this topic naturally extends to serving models with Google Cloud.
