From Keras model to Angular application

Vitaly Bezgachev
Towards Data Science
9 min read · Jul 9, 2018

Introduction

While working with TensorFlow Serving, I thought it would be really great to serve Keras models too. The advantage of Keras is obvious: it significantly simplifies model development and lets you try out models much faster than with the pure TensorFlow framework.

Another motivation: I wanted to make the client independent of the huge TensorFlow framework and use only a very limited part of the Serving capabilities. And, of course, I wanted to visualize the results without digging through boring JSON output :-)

Serve Keras models with TensorFlow Serving

Keras provides a high-level neural network API and can run on top of TensorFlow, CNTK or Theano. Basically, it abstracts those frameworks, is much easier to understand and learn, and allows you to do more with less code.

TensorFlow Serving is a piece of software for hosting machine learning models. Its main purpose is high-performance production serving. It is written in C++ and is built around the concept of Servables, which clients use for computations.

How to talk to TensorFlow server

TensorFlow Serving provides a gRPC API for executing regression, prediction and inference tasks. The gRPC API performs much better than a REST API over HTTP, but it cannot be used as easily by web applications. So, in my eyes, gRPC is a perfect choice for internal clients, but it should be wrapped by a service that provides a REST API to the outside world.

Dog breed detector

For my example application, I took a dog breed detector that I implemented during my Udacity nano-degree course. The problem we want to solve is detecting the breed from a given dog image. The model uses a convolutional neural network (CNN) architecture together with a network pre-trained on the ImageNet dataset (I chose DenseNet 201). The model is implemented with the Keras library.

The second part of the application is a Node.js service that wraps the gRPC API of TensorFlow Serving and provides a REST API to the outside world. The service depends on TensorFlow as little as possible: it uses modified protobuf files and the gRPC library for server requests. So we do not need to install any huge TensorFlow packages here.

And the last part is a very simple (and not so nice) Angular application that lets you select dog images, sends requests to our wrapper service and displays the predicted breeds.

The code can be found in my GitHub repo. Feel free to copy, modify and use it if you find it useful :-)

From a model to the application

Let's dive into the implementation details. There are 3 main parts here:

  • Create and train the model with Keras and prepare it for TensorFlow Serving
  • Implement a wrapper service that provides REST API to the outside world
  • Create a simple application that requests dog breed predictions and displays the results

Dog breed detector model

I don't want to clutter the article with a lot of code. Instead, I will provide links to the implementation and explain the main challenges I had.

My approach to the model creation is fairly simple (you can follow it in dog_breed_detector_trainer.py) and is explained very well by Francois Chollet on the Keras blog. The steps are:

  • Load the pre-trained DenseNet 201 with its weights but without the top layers, and extract the so-called bottleneck features from the pre-trained network. I have implemented this in data_extractor.py.
  • Create a simple top model that takes the extracted features as input and has only Global Average Pooling and Fully-Connected layers (see the sketch after this list). The model itself is implemented in dog_breed_detector_model.py.
  • Train the top model and save the checkpoints. You can find it here.
  • Create a final model that "joins" the pre-trained DenseNet 201 and the trained top model. This is implemented in final_model.py.
  • Prepare and save the final model for TensorFlow Serving. You can find it here.
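
To give an idea of what such a top model looks like in Keras, here is a minimal sketch. The layer sizes, the number of breeds (133 in the Udacity dataset) and the compile settings are my assumptions for illustration, not code copied from the repo:

from keras.layers import Dense, GlobalAveragePooling2D, Input
from keras.models import Model

NUM_BREEDS = 133  # assumed number of classes in the dog breed dataset

def create_top_model(input_shape):
    # Bottleneck features extracted from DenseNet 201 serve as input
    features = Input(shape=input_shape)
    x = GlobalAveragePooling2D()(features)
    predictions = Dense(NUM_BREEDS, activation='softmax')(x)
    model = Model(inputs=features, outputs=predictions)
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# DenseNet 201 produces 7x7x1920 feature maps for 224x224 input images
top_model = create_top_model(input_shape=(7, 7, 1920))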

The main challenge was to find a proper way to convert a Keras model to TensorFlow and prepare it for TensorFlow Serving. Basically, we have two tasks here: convert the Keras model to a TF Estimator and export the Estimator for TensorFlow Serving.

Starting from TensorFlow 1.4, we can convert Keras models to TF Estimators: simply call the model_to_estimator() function and you are done!

# Convert the compiled Keras model into a TF Estimator (TF 1.4+)
tf_estimator = tf.keras.estimator.model_to_estimator(keras_model=model)

Now we can save the estimator for serving as described here. It is just a call to the export_savedmodel() function with a serving input receiver function. Such a function adds an extra layer on top of the final model and is responsible for parsing the input. In our case, it converts the input JPEG image into a 3D tensor that the model can consume.

tf_estimator.export_savedmodel(export_dir,
                               serving_input_receiver_fn,
                               strip_default_attrs=True)
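
To illustrate what such a serving input receiver function can look like, here is a minimal sketch. The tensor names, the 224x224 image size and the preprocessing steps are my assumptions and may differ from the actual implementation in the repo:

import tensorflow as tf

def serving_input_receiver_fn():
    # The client sends serialized JPEG bytes under the (assumed) key 'image'
    serialized_images = tf.placeholder(dtype=tf.string, shape=[None], name='image_bytes')

    def decode_and_resize(jpeg_bytes):
        image = tf.image.decode_jpeg(jpeg_bytes, channels=3)
        image = tf.image.convert_image_dtype(image, tf.float32)
        return tf.image.resize_images(image, [224, 224])

    images = tf.map_fn(decode_and_resize, serialized_images, dtype=tf.float32)
    # The feature key must match the input layer name of the Keras model (assumed 'input_1')
    return tf.estimator.export.ServingInputReceiver(
        features={'input_1': images},
        receiver_tensors={'image': serialized_images})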

To create, train and prepare the model for serving, first install unzip (for unzipping the downloaded archive with dog images):

sudo apt-get update
sudo apt-get install unzip

Then clone the repository, change to a model serving directory, download and unzip the dog images and train the model:

git clone https://github.com/Vetal1977/tf_serving_keras.git
cd tf_serving_keras/model_serving
curl https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip --output dogImages.zip
unzip dogImages.zip
mv dogImages dog_images
rm dogImages.zip
python dog_breed_detector_trainer.py

My environment includes:

  • Conda 4.3.14
  • Python 3.5.4
  • GPU version of TensorFlow 1.8
  • Keras 2.1.6

Node.js wrapper service

The second component is a wrapper service that provides a RESTful API to the outside world and talks gRPC to the TensorFlow server. An additional requirement: as little dependency on TensorFlow as possible. I chose Node.js and TypeScript for the service implementation.

The first step is the preparation of the protobufs: I took them from the official repository and threw away everything I didn't need. You can find my modified versions here. I load the protobufs dynamically, i.e. at runtime, and then create a prediction service like this:

this.tfServing = grpc.load(this.PROTO_PATH).tensorflow.serving;
this.client = new this.tfServing.PredictionService(
    this.tfServerUrl, grpc.credentials.createInsecure());

The advantage of dynamic loading is that you do not need to regenerate TypeScript code after each protobuf modification. The disadvantage is a performance penalty. Since I load the protobufs only once, this disadvantage is not critical.

Now, when the service gets called via the REST interface, we take the input data (the image as a base64-encoded string) and create a gRPC request to the TensorFlow server; please find the details in the sources.

The wrapper service is a Node.js Express application and uses inversify for dependency injection and inversify-express-utils for the REST API implementation.

The API base path of my service is /api/v1, and my controller implements the only endpoint, /predict_breed, which accepts image uploads and requests a dog breed prediction from the TensorFlow server. To build the project, execute the following commands (I assume you have already cloned the repo):

cd tf_serving_keras/detector-api
npm install
npm run build

My environment includes Node 8.11.3 and npm 6.1.0.

Angular application

And the last part is a simple Angular application with a button to select an image directory and an area for displaying the images with the predicted breed names. Nothing fancy here: I used this guide to create a new Angular project and extended the code to my needs.

The client, which talks to the wrapper service, is implemented in detector.service.api.client.ts. One note on the implementation: I have an abstract class that declares a prediction method and two implementations of it, the one mentioned above and a second one where I tried the brand new TensorFlow Serving RESTful API. I will provide some comments about it later.

We need to pay attention to the CORS mechanism. The Angular HttpClient is based on XMLHttpRequest, and I had to add the following header in my Node.js wrapper service to get the response through to the application:

'Access-Control-Allow-Origin': '*'

Here is a typical application screen with dog images and predicted breeds:

To build a project, execute the following commands (I assume you have already cloned the repo):

cd tf_serving_keras/detector-app
npm install
npm run build

Test locally with Docker

Honestly, I'm too lazy to start and run all 3 components separately :-) Docker and Docker Compose make my life easier. I need 3 Docker containers — one for TensorFlow Serving that hosts my model, one for the wrapper service and one for my application. I have the following versions installed: Docker 18.03.1-ce and Docker Compose 1.21.2.

Docker image for TensorFlow Serving

Before creating a Docker image, you must have an exported model for TensorFlow Serving in place; please see above how to do that. Last time, it took a lot of effort to create a Docker container for TensorFlow Serving. Things have changed since then, and now we can install the Serving component with apt-get, without cloning the repository and building the server ourselves.

echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.listcurl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -sudo apt-get update && sudo apt-get install tensorflow-model-server

I created a Dockerfile, where I execute those commands, copy the model prepared for serving and start the server. If you want to create the Docker image, please execute the following commands:

cd tf_serving_keras/model_serving
<activate your Python environment>
python dog_breed_detector_trainer.py
docker build -t model-serving:latest -f Dockerfile .

Docker image for Node.js wrapper service

The Dockerfile for the wrapper service is based on the Node 8.11.3 image. It copies the sources into the image, builds them and starts the service. Nothing special, all standard.

Docker image for Angular application

The Dockerfile for my application uses a multi-stage build. First, we use the Node 8.11.3 image to build the application, and then an Nginx image to serve it behind the Nginx server, which makes a lot of sense in a production environment.

Compose them all together

We don't need to, and shouldn't, start the 3 Docker containers one by one. Instead, we compose them together and make them visible to each other with Docker Compose. In the docker-compose file, I have 3 services that belong to the same network. The application depends on the wrapper service, and the wrapper service depends on TensorFlow Serving. The services expose container ports and can communicate with each other by their names.

To run the complete application with a single command, please execute

cd tf_serving_keras
docker-compose -f docker.compose.yaml up

Open your browser and go to localhost. You should be able to see the application, select the images and see the results. Don't forget to shut down the containers with

docker-compose -f docker.compose.yaml down

Trying TensorFlow Serving 1.8 and its RESTful API

When I was done with my implementation, I discovered that, starting from version 1.8, TensorFlow Serving provides a RESTful API too. This is a pretty new feature, and I wanted to try it.

Unfortunately, it has some problems. First, for the CORS mechanism you need a special kind of proxy, since you cannot change the server code. The most popular one is cors-anywhere. I created a small wrapper and packed it into a Docker container. As mentioned previously, I implemented a client in my application that talks to the TensorFlow server directly via REST.

Second, you have to include the image data in the JSON object that you send to the server. For big images this is not the right way, and I would always prefer multipart/form-data for that purpose.
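
For illustration, here is a minimal sketch of such a request in Python, based on TensorFlow Serving's documented REST API. The model name, the port and the single string input of the signature are my assumptions, not values taken from the repo:

import base64
import json
import requests

# Read the image and encode it as base64, as required by the REST API
with open('dog.jpg', 'rb') as f:
    jpeg_b64 = base64.b64encode(f.read()).decode('utf-8')

# Binary tensors must be wrapped in a {"b64": ...} object
payload = {'instances': [{'b64': jpeg_b64}]}
response = requests.post(
    'http://localhost:8501/v1/models/dog_breed_detector:predict',  # assumed model name
    data=json.dumps(payload))
print(response.json())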

If you want to try, look into client sources and start Docker containers with

docker-compose -f docker.compose.cors.yaml up

GPU support

If you have a computer with an NVIDIA graphics card and the CUDA and cuDNN libraries installed, then we can use them in Docker too. But we need to make some preparations first:

curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
sudo apt-get install nvidia-container-runtime
sudo tee /etc/docker/daemon.json <<EOF
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo pkill -SIGHUP dockerd

Now just run

docker-compose -f docker.compose.gpu.yaml up

and you should get a GPU-powered version of the application. You can find the GPU-enabled Dockerfile and Docker Compose files in the repository.

CAUTION: it could take up to a couple of hours until everything is up and running. The reason is that we still need to compile the GPU version of TensorFlow Serving ourselves to create a proper Docker image.

Conclusion

It was a good experience for me to implement a complete deep learning application, starting from a Keras model all the way up to the UI. There were some new things I tried and used, and some challenges I needed to solve to run and test everything locally. I hope you find this useful for your own purposes :-)
