Publishing the model to get predictions on new data from a REST API
Getting predictions on new data over a REST API, with the model hosted using TensorFlow Serving.
So, you have a beautiful model which works like a charm on the data. Now, you want to put that model in production and get predictions on new data.
Let me introduce you to TensorFlow Serving, a system designed to serve trained models in production. By default, it comes with seamless integration with TensorFlow models, but it can be extended to serve other kinds of models as well.
Saving the model
So, let’s say that you have a trained model after following the steps in this article, and that you are using Google Colab.
For TensorFlow Serving, the model has to be saved in the SavedModel format, under a versioned directory together with its variables, which you can do with the code below:
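A minimal sketch of that export step. The model name `my_model`, version `1`, and the `/tmp` base directory are placeholders, and `DemoModel` is a tiny stand-in for the trained model from the article:

```python
import os
import tensorflow as tf

# TensorFlow Serving expects a versioned SavedModel layout:
#   <base_dir>/<model_name>/<version>/saved_model.pb (+ variables/)
# "my_model" and version "1" are placeholder names for this sketch.
export_path = os.path.join("/tmp", "my_model", "1")

class DemoModel(tf.Module):
    """Tiny stand-in for the trained model from the article."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.ones([4, 1]))

    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32)])
    def __call__(self, x):
        return tf.matmul(x, self.w)

model = DemoModel()

# Saving writes the serialized graph plus the variables under the
# version directory, which is exactly what TF Serving watches for.
tf.saved_model.save(model, export_path,
                    signatures={"serving_default": model.__call__})
print("exported files:", sorted(os.listdir(export_path)))
```

When you retrain, export the new model to the next version folder (e.g. `/tmp/my_model/2`) and TF Serving can pick it up without a restart.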
Examining the model we just saved
This is helpful when several models are deployed to TF Serving and we are unsure of their specifics. In that case, inspecting a saved model to reveal its signatures, inputs, and outputs becomes invaluable.
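A self-contained sketch of such an inspection: it first exports a tiny placeholder module so the example runs on its own, then reloads it and prints what the SavedModel exposes. In practice, point `export_path` at the model you actually want to examine (the `saved_model_cli show --dir <path> --all` command-line tool gives the same information):

```python
import tensorflow as tf

# Placeholder path; in practice, point this at an existing export.
export_path = "/tmp/inspect_demo/1"

class DemoModel(tf.Module):
    """Tiny placeholder so this inspection example is runnable as-is."""

    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32)])
    def __call__(self, x):
        return tf.reduce_sum(x, axis=1, keepdims=True)

demo = DemoModel()
tf.saved_model.save(demo, export_path,
                    signatures={"serving_default": demo.__call__})

# Reload and reveal the contents of the saved model.
loaded = tf.saved_model.load(export_path)
signature_keys = list(loaded.signatures.keys())
print("signatures:", signature_keys)

infer = loaded.signatures["serving_default"]
print("inputs: ", infer.structured_input_signature)
print("outputs:", infer.structured_outputs)
```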
Serving the model and getting the predictions
Serving allows us to manage the versions of models and prioritise which version to use. This is especially useful when retraining the model or doing transfer learning, without compromising the integrity of the platform's architecture.
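As an illustration of that version control, TF Serving can be started with a model-config file (passed via the `--model_config_file` flag) that pins exactly which versions are loaded. The model name, base path, and version numbers below are placeholders:

```
model_config_list {
  config {
    name: "my_model"
    base_path: "/tmp/my_model"
    model_platform: "tensorflow"
    model_version_policy {
      specific {
        versions: 1
        versions: 2
      }
    }
  }
}
```

With the `specific` policy, both versions stay loaded side by side, so a retrained version 2 can be tried out while version 1 keeps serving traffic.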
Now let’s use TensorFlow Serving’s REST API to make predictions:
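A sketch of the client side, assuming TF Serving is running locally on its default REST port 8501 with a model named `my_model` (both placeholder assumptions). It uses only the standard library; the popular `requests` package works the same way:

```python
import json
from urllib import request

# Default REST endpoint layout for TF Serving; the port and model
# name are assumptions for this sketch.
SERVER_URL = "http://localhost:8501/v1/models/my_model:predict"

def make_payload(instances):
    """Build the JSON body that the predict endpoint expects."""
    return json.dumps({"signature_name": "serving_default",
                       "instances": instances})

def get_predictions(instances, url=SERVER_URL):
    """POST a batch of instances and return the predictions list."""
    req = request.Request(url,
                          data=make_payload(instances).encode("utf-8"),
                          headers={"content-type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["predictions"]

# Example call, once the server is up:
# preds = get_predictions([[1.0, 2.0, 3.0, 4.0]])
```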
This concludes the topic, covering the end-to-end process of building the model (as described in the previously mentioned article) and then hosting it to get predictions on new data using the REST API.
There are various ways to use Serving, and this topic itself expands into and connects with serving models on Google Cloud.