Converting a Simple Deep Learning Model from PyTorch to TensorFlow

Yu Xuan Lee
Towards Data Science
May 22, 2019 · 6 min read

Reference: https://towardsdatascience.com/applied-deep-learning-part-1-artificial-neural-networks-d7834f67a4f6

Introduction

TensorFlow and PyTorch are two of the more popular frameworks for deep learning. Some people prefer TensorFlow for its deployment support, while others prefer PyTorch for the flexibility it offers in model building and training, without the difficulties faced when using TensorFlow. The downside of using PyTorch is that a model built and trained with this framework cannot be deployed into production. (Update in Dec 2019: it is claimed that later versions of PyTorch have better support for deployment, but I believe that is something else to be explored.) To address the issue of deploying models built using PyTorch, one solution is to use ONNX (Open Neural Network Exchange).

As explained on ONNX’s About page, ONNX is like a bridge that links the various deep learning frameworks together: it enables the conversion of models from one framework to another. As of this writing, ONNX is limited to simpler model structures, though further additions may come later. This article illustrates how a simple deep learning model can be converted from PyTorch to TensorFlow.

Installing the necessary packages

To start off, we need to install PyTorch, TensorFlow, ONNX, and ONNX-TF (the package for converting ONNX models to TensorFlow). If using virtualenv on Linux, you can run the commands below (replace tensorflow with tensorflow-gpu if you have NVIDIA CUDA installed). Do note that, as of Dec 2019, ONNX does not work with TensorFlow 2.0 yet, so take note of the version of TensorFlow that you install.

source <your virtual environment>/bin/activate
pip install tensorflow==1.15.0
# For PyTorch, choose one of the following (refer to https://pytorch.org/get-started/locally/ for further details)
pip install torch torchvision # if using CUDA 10.1
pip install torch==1.3.1+cu92 torchvision==0.4.2+cu92 -f https://download.pytorch.org/whl/torch_stable.html # if using CUDA 9.2
pip install torch==1.3.1+cpu torchvision==0.4.2+cpu -f https://download.pytorch.org/whl/torch_stable.html # if using CPU only
pip install onnx
# For onnx-tensorflow, you may want to refer to the installation guide here: https://github.com/onnx/onnx-tensorflow
git clone https://github.com/onnx/onnx-tensorflow.git
cd onnx-tensorflow
pip install -e .

If using Conda, you may want to run the following commands instead:

conda activate <your virtual environment>
conda install -c pytorch pytorch
pip install tensorflow==1.15.0
pip install onnx
# For onnx-tensorflow, you may want to refer to the installation guide here: https://github.com/onnx/onnx-tensorflow
git clone https://github.com/onnx/onnx-tensorflow.git
cd onnx-tensorflow
pip install -e .

I find that installing TensorFlow, ONNX, and ONNX-TF using pip ensures that the packages are compatible with one another. It is OK, however, to install the packages in other ways, as long as they work properly on your machine.

To test that the packages have been installed correctly, you can run the following commands:

python
import tensorflow as tf
import torch
import onnx
from onnx_tf.backend import prepare

If you do not see any error messages, it means that the packages are installed correctly, and we are good to go.

In this example, I used Jupyter Notebook, but the conversion can also be done in a .py file. To install Jupyter Notebook, you can run one of the following commands:

# Installing Jupyter Notebook via pip
pip install notebook
# Installing Jupyter Notebook via Conda
conda install notebook

Building, training, and evaluating the example model

The next thing to do is to obtain a model in PyTorch that can be used for the conversion. In this example, I generated some simulated data and used this data for training and evaluating a simple Multilayer Perceptron (MLP) model. The following snippet shows how the installed packages are imported, and how I generated and prepared the data.
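
Something along these lines works (the exact data-generation scheme is an assumption for illustration; make_blobs produces 20 features here to match the model's input dimension used later):

import torch
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# Simulated binary-classification data with 20 features
x, y = make_blobs(n_samples=1000, n_features=20, centers=2, random_state=42)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# Convert to float tensors; targets get shape (N, 1) for the BCE loss
x_train = torch.tensor(x_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32).view(-1, 1)
x_test = torch.tensor(x_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32).view(-1, 1)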

I then created a class for the simple MLP model and defined the layers such that we can specify any number and size of hidden layers. I also defined a binary cross-entropy loss and Adam optimizer to be used for the computation of loss and weight updates during training. The following snippet shows this process.
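
A sketch of such a class (the class name and the nn.Sequential construction are assumptions; the hidden sizes of 50 match the tensor shapes in the TensorFlow printout further down):

import torch.nn as nn

class SimpleMLP(nn.Module):
    def __init__(self, input_dim, hidden_dims, output_dim):
        super(SimpleMLP, self).__init__()
        layers = []
        prev_dim = input_dim
        # Any number and size of hidden layers, each followed by ReLU
        for h in hidden_dims:
            layers.append(nn.Linear(prev_dim, h))
            layers.append(nn.ReLU())
            prev_dim = h
        layers.append(nn.Linear(prev_dim, output_dim))
        layers.append(nn.Sigmoid())  # squashes the output to [0, 1] for BCE
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return self.layers(x)

model = SimpleMLP(input_dim=20, hidden_dims=[50, 50], output_dim=1)
criterion = nn.BCELoss()                                    # binary cross-entropy loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer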

After building the model and defining the loss and optimizer, I trained the model for 20 epochs using the generated training set, then used the test set for evaluation. The test loss and accuracy of the model were not good, but that does not really matter here, as the main purpose is to show how to convert a PyTorch model to TensorFlow. The snippet below shows the training and evaluation process.
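
A minimal version of that loop, assuming full-batch training for simplicity:

for epoch in range(20):
    model.train()
    optimizer.zero_grad()              # reset accumulated gradients
    y_hat = model(x_train)             # forward pass
    loss = criterion(y_hat, y_train)   # binary cross-entropy
    loss.backward()                    # backpropagation
    optimizer.step()                   # weight update
    print('Epoch {}: train loss {:.4f}'.format(epoch, loss.item()))

# Evaluation on the test set
model.eval()
with torch.no_grad():
    y_hat = model(x_test)
    test_loss = criterion(y_hat, y_test)
    accuracy = ((y_hat > 0.5).float() == y_test).float().mean()
print('Test loss {:.4f}, accuracy {:.4f}'.format(test_loss.item(), accuracy.item()))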

After training and evaluating the model, we would need to save the model, as below:
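
For example (the file name is an assumption):

# Save the trained weights to a .pt file
torch.save(model.state_dict(), 'mlp.pt')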

Converting the model to TensorFlow

Now, we need to convert the .pt file to a .onnx file using the torch.onnx.export function. There are two things we need to take note of here: 1) we need to pass a dummy input as one of the arguments to the export function, and 2) the dummy input needs to have the shape (1, dimension(s) of single input). For example, if a single input is an image array with the shape (number of channels, height, width), then the dummy input needs to have the shape (1, number of channels, height, width). The dummy input is needed as an input placeholder for the resulting TensorFlow model. The following snippet shows the process of exporting the PyTorch model in the ONNX format. I included the input and output names as arguments as well, to make inference in TensorFlow easier.
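
A sketch of the export step, continuing with the file names assumed earlier:

# Rebuild the model and load the trained weights
model = SimpleMLP(input_dim=20, hidden_dims=[50, 50], output_dim=1)
model.load_state_dict(torch.load('mlp.pt'))
model.eval()

# Dummy input of shape (1, 20): a batch of one sample with 20 features
dummy_input = torch.randn(1, 20)
dummy_output = model(dummy_input)  # kept for comparison after conversion
print(dummy_output)

# Export to ONNX, naming the graph's input and output tensors
torch.onnx.export(model, dummy_input, 'mlp.onnx',
                  input_names=['input'], output_names=['output'])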

After getting the .onnx file, we would need to use the prepare() function in ONNX-TF’s backend module to convert the model from ONNX to TensorFlow.
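
For example (the file names are again assumptions):

import onnx
from onnx_tf.backend import prepare

# Load the ONNX model and convert it to a TensorFlow representation
onnx_model = onnx.load('mlp.onnx')
tf_rep = prepare(onnx_model)

# Export the TensorFlow representation as a .pb file
tf_rep.export_graph('mlp.pb')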

Doing inference in TensorFlow

Here comes the fun part, which is to see if the resultant TensorFlow model can do inference as intended. Loading a TensorFlow model from a .pb file can be done by defining the following function.
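
The function can be written along these lines using TF 1.x APIs (passing name='' keeps the original tensor names, so 'input:0' and 'output:0' can be looked up directly):

import tensorflow as tf

def load_pb(path_to_pb):
    # Read the serialized GraphDef from the .pb file
    with tf.gfile.GFile(path_to_pb, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    # Import it into a fresh graph, keeping the original tensor names
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        return graph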

With the function to load the model defined, we need to start a TensorFlow graph session, specify the placeholders for the input and output, and feed an input into the session.
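
Continuing with the names assumed earlier:

tf_graph = load_pb('mlp.pb')
sess = tf.Session(graph=tf_graph)

# Print every tensor in the graph; this produces the listing below
for op in tf_graph.get_operations():
    print(op.values())

# Look up the placeholders named in torch.onnx.export
output_tensor = tf_graph.get_tensor_by_name('output:0')
input_tensor = tf_graph.get_tensor_by_name('input:0')

# Feed the same dummy input used during export
output = sess.run(output_tensor, feed_dict={input_tensor: dummy_input.numpy()})
print(output)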

The output of the snippet above would look like the listing below. The names of the input and output placeholders, 'input:0' and 'output:0', correspond to the names specified in the torch.onnx.export function.

(<tf.Tensor 'Const:0' shape=(50,) dtype=float32>,)
(<tf.Tensor 'Const_1:0' shape=(50, 20) dtype=float32>,)
(<tf.Tensor 'Const_2:0' shape=(50,) dtype=float32>,)
(<tf.Tensor 'Const_3:0' shape=(50, 50) dtype=float32>,)
(<tf.Tensor 'Const_4:0' shape=(1,) dtype=float32>,)
(<tf.Tensor 'Const_5:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'input:0' shape=(1, 20) dtype=float32>,)
(<tf.Tensor 'flatten/Reshape/shape:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'flatten/Reshape:0' shape=(1, 20) dtype=float32>,)
(<tf.Tensor 'transpose/perm:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'transpose:0' shape=(20, 50) dtype=float32>,)
(<tf.Tensor 'MatMul:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'mul/x:0' shape=() dtype=float32>,)
(<tf.Tensor 'mul:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'mul_1/x:0' shape=() dtype=float32>,)
(<tf.Tensor 'mul_1:0' shape=(50,) dtype=float32>,)
(<tf.Tensor 'add:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'Relu:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'flatten_1/Reshape/shape:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'flatten_1/Reshape:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'transpose_1/perm:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'transpose_1:0' shape=(50, 50) dtype=float32>,)
(<tf.Tensor 'MatMul_1:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'mul_2/x:0' shape=() dtype=float32>,)
(<tf.Tensor 'mul_2:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'mul_3/x:0' shape=() dtype=float32>,)
(<tf.Tensor 'mul_3:0' shape=(50,) dtype=float32>,)
(<tf.Tensor 'add_1:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'Relu_1:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'flatten_2/Reshape/shape:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'flatten_2/Reshape:0' shape=(1, 50) dtype=float32>,)
(<tf.Tensor 'transpose_2/perm:0' shape=(2,) dtype=int32>,)
(<tf.Tensor 'transpose_2:0' shape=(50, 1) dtype=float32>,)
(<tf.Tensor 'MatMul_2:0' shape=(1, 1) dtype=float32>,)
(<tf.Tensor 'mul_4/x:0' shape=() dtype=float32>,)
(<tf.Tensor 'mul_4:0' shape=(1, 1) dtype=float32>,)
(<tf.Tensor 'mul_5/x:0' shape=() dtype=float32>,)
(<tf.Tensor 'mul_5:0' shape=(1,) dtype=float32>,)
(<tf.Tensor 'add_2:0' shape=(1, 1) dtype=float32>,)
(<tf.Tensor 'output:0' shape=(1, 1) dtype=float32>,)

If all goes well, the result of print(output) should match that of print(dummy_output) in the earlier step.

Conclusion

Converting a model via ONNX can be pretty straightforward, provided that the model is not too complicated. The steps in this example work for deep learning models with a single input and output. Models with multiple inputs and/or outputs are more challenging to convert via ONNX, so an example covering such models will have to wait for another article, unless newer versions of ONNX can handle them.

The Jupyter notebook containing all the code can be found here.
