
How to Deploy TensorFlow Models in C++ in 3 Different Ways

A summary of my journey doing this over the last 4 years

Image made by the author

Introduction

In machine learning projects, there are several ways to deploy your final model: in the cloud, on mobile devices, on embedded systems, etc.

For this, you can leverage several programming languages depending on the tech stack you’re using in your day job or project.

At some of the companies I worked for, we needed to deploy our deep learning models in a C++ environment. These models mostly dealt with image classification and object detection.

I remember asking a question on Stack Overflow almost 4 years ago about how to deploy a TensorFlow model in a C++ environment. I received only 2 answers over that 4-year period!

That’s when I first started looking into the different options for deploying TensorFlow models in a C++ environment.

I’ve mainly tried 3 options because they seemed the most promising:

  1. Deploy the models using the OpenCV DNN module.
  2. Deploy the models using the TensorFlow C++ API.
  3. Deploy the models using ONNX Runtime.

Before I tell you what happened with each option, let me ask you this. Are you currently looking to learn how to use TensorFlow for computer vision? If yes, then check out my free TensorFlow course.

Now, here’s what happened with each of the options I mentioned above.

Deploying TensorFlow models in C++ using the OpenCV DNN module

OpenCV logo

The first option that I remember trying was the OpenCV DNN module. This was back in 2018.

DNN stands for deep neural network. This module was introduced by the OpenCV developers to make it possible to run inference on deep neural networks from within applications that already use OpenCV.

This turned out to be a quick fix for what we were aiming to do, but it was a very limiting approach since the module was still new back then and offered few options for deploying deep neural networks.

For example, I needed to deploy an InceptionV2 model but couldn’t, since the OpenCV DNN module only supported the InceptionV1 architecture at the time.

The DNN module has evolved in recent years, and it now offers a lot more flexibility for deploying classification and object detection models.
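To give you an idea of what this looks like in practice, here is a minimal sketch of running a frozen TensorFlow classification graph through the DNN module. The file names and the 224x224 input size are placeholders; check what your own model expects.

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/core.hpp>
#include <iostream>

int main() {
    // Load a frozen TensorFlow graph (file name is a placeholder).
    cv::dnn::Net net = cv::dnn::readNetFromTensorflow("frozen_graph.pb");

    // Read an image and convert it to a 4D blob of the size the
    // network expects -- 224x224 is typical for Inception-style models.
    cv::Mat image = cv::imread("input.jpg");
    cv::Mat blob = cv::dnn::blobFromImage(image, 1.0 / 255.0,
                                          cv::Size(224, 224),
                                          cv::Scalar(), /*swapRB=*/true);

    net.setInput(blob);
    cv::Mat scores = net.forward();  // 1 x num_classes for a classifier

    // Pick the class with the highest score.
    cv::Point class_id;
    double confidence;
    cv::minMaxLoc(scores, nullptr, &confidence, nullptr, &class_id);
    std::cout << "class " << class_id.x << " score " << confidence << "\n";
    return 0;
}
```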

The other two options I tried were the TensorFlow C++ API and ONNX Runtime.

Deploying models using the TensorFlow C++ API

TensorFlow logo

TensorFlow is built in C++, and it offers an API that makes it relatively easy to deploy models (and even train them, if you wish) in C++. This sounds great, until you start trying to implement a program that uses this API.

As a developer, you know that documentation is very important, and the documentation for TensorFlow’s C++ API is extremely limited. It’s very hard to find the information you’re looking for just by reading it.

Secondly, there is no support for 32-bit Windows. So if you (or your clients) have such a system, you’re facing a wall here, and it’s better to look for other options.

But the great thing about this API is that you don’t have to worry about the compatibility of a model you trained in Python when you deploy it with the C++ API, especially if you use the same version of TensorFlow on both sides.

I’ve personally deployed image classification and object detection models using this API, and apart from the limiting factors mentioned above, the models worked exactly as expected.
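As a rough illustration, here is a minimal sketch of loading a SavedModel and running inference with the TensorFlow C++ API. The export directory, tensor names, and input shape are placeholders you would replace with your own model’s values.

```cpp
#include "tensorflow/cc/saved_model/loader.h"
#include "tensorflow/cc/saved_model/tag_constants.h"
#include "tensorflow/core/framework/tensor.h"
#include <iostream>
#include <vector>

int main() {
    // Load a SavedModel exported from Python (path is a placeholder).
    tensorflow::SavedModelBundle bundle;
    tensorflow::SessionOptions session_options;
    tensorflow::RunOptions run_options;
    TF_CHECK_OK(tensorflow::LoadSavedModel(
        session_options, run_options, "/path/to/saved_model",
        {tensorflow::kSavedModelTagServe}, &bundle));

    // Build an input tensor; the shape depends on your model.
    tensorflow::Tensor input(tensorflow::DT_FLOAT,
                             tensorflow::TensorShape({1, 224, 224, 3}));
    input.flat<float>().setZero();  // dummy data for the sketch

    std::vector<tensorflow::Tensor> outputs;
    TF_CHECK_OK(bundle.session->Run(
        {{"serving_default_input:0", input}},   // input name: placeholder
        {"StatefulPartitionedCall:0"},          // output name: placeholder
        {}, &outputs));

    std::cout << "output shape: " << outputs[0].shape().DebugString() << "\n";
    return 0;
}
```

You can find the actual input and output tensor names of your export with the saved_model_cli tool that ships with TensorFlow.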

The last option I personally tried for deploying TensorFlow models in C++ is ONNX Runtime.

Deploying models using ONNX Runtime

Image from Wikipedia

ONNX stands for Open Neural Network Exchange, and it’s a whole ecosystem that aims to standardize the representation of machine learning models. It was co-developed by Microsoft and Facebook, and the ONNX Runtime inference engine is developed by Microsoft.

What ONNX aims to do is make it easier to deploy any kind of machine learning model coming from any ML framework, including TensorFlow.

To deploy TensorFlow models using ONNX in C++, you need to do 2 things:

  1. Convert your TensorFlow model to ONNX format. There is an open-source tool for this called tf2onnx.
  2. Use ONNX Runtime to run inference with the converted model (a sketch of both steps follows below).
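Here is a minimal sketch of both steps, assuming a SavedModel export. The conversion command is shown as a comment, and the model file and tensor names on the C++ side are placeholders.

```cpp
// Step 1 (run once, in a shell): convert the SavedModel to ONNX, e.g.
//   python -m tf2onnx.convert --saved-model /path/to/saved_model --output model.onnx
//
// Step 2: run inference on the converted model with ONNX Runtime.
#include <onnxruntime_cxx_api.h>
#include <array>
#include <iostream>
#include <vector>

int main() {
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "demo");
    Ort::SessionOptions options;
    Ort::Session session(env, "model.onnx", options);

    // Input shape and tensor names are placeholders for this sketch.
    std::array<int64_t, 4> shape{1, 224, 224, 3};
    std::vector<float> input_data(1 * 224 * 224 * 3, 0.0f);  // dummy input

    Ort::MemoryInfo memory_info =
        Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
        memory_info, input_data.data(), input_data.size(),
        shape.data(), shape.size());

    const char* input_names[] = {"input"};    // placeholder name
    const char* output_names[] = {"output"};  // placeholder name
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &input_tensor, 1,
                               output_names, 1);

    std::cout << "output elements: "
              << outputs[0].GetTensorTypeAndShapeInfo().GetElementCount()
              << "\n";
    return 0;
}
```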

I’ve personally tested this approach on many deep learning models, and it works great. For example, I converted almost all of the models in the TensorFlow Object Detection API into ONNX format and was able to run inference with them with no problem.

I fell in love with this tool after a friend suggested it to me and after seeing all of its capabilities.

About the author

I am a Machine Learning Engineer working on solving challenging computer vision problems. Follow me on LinkedIn and Twitter for bite-size daily ML content. Also, get articles like this straight into your inbox by joining my newsletter.

