A first look at Pytorch 1.0

Nikhil B
Towards Data Science
Oct 12, 2018 · 6 min read


Image by Facebook

I had the pleasure of attending the first Pytorch Developer Conference on October 2nd, and it was a great experience! Pytorch has quickly become a first-class citizen among deep learning frameworks, despite having started out just two years ago. There were a lot of familiar faces with whom I caught up over the event. I’ll be covering the parts of the Pytorch 1.0 morning session that were most relevant to software developers. The session included talks by Facebook VP Jerome Pesenti and members of the Pytorch core dev team, and was geared towards introducing Pytorch 1.0 to the community.

Pytorch is primarily developed by Facebook's AI Research team, whose goal is to make AI open and collaborative so that more people can take advantage of it. The team was founded about five years ago and has provided core support to most of Facebook's products ever since. Facebook uses artificial intelligence extensively to enhance its products; some broad areas are (a) providing recommendations (newsfeed articles you might like), (b) machine translation (so everyone across the world can understand each other), and (c) creating bots and assistants for messaging. Facebook Research also has many partnerships across industry and academia.

Pytorch is used for research and for prototyping new models and systems. The framework is flexible and imperative, and therefore easy to use. To reach production, a Pytorch model is then ported via ONNX (Open Neural Network Exchange), a format for deep learning models that enables moving them across open source AI frameworks, into a form suitable for the production system: Caffe2. At Facebook, Caffe2 is the workhorse for most AI applications and is deployed at scale across data centers; the framework is robust, stable, and powers 300 trillion predictions per day. The goal of Pytorch 1.0 is to combine the best features of all three into a single framework, providing a seamless path from research to production.
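To make that path concrete, here is a minimal sketch of exporting a model to ONNX; the torchvision model and file name are my own illustrative choices, not code from the talk:

```python
import torch
import torchvision

# A trained model standing in for a research prototype.
model = torchvision.models.resnet18(pretrained=True)
model.eval()

# Export to the ONNX format, which Caffe2 (or any other
# ONNX-compatible runtime) can then load for production serving.
dummy_input = torch.randn(1, 3, 224, 224)  # example input used to record the graph
torch.onnx.export(model, dummy_input, "resnet18.onnx")
```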

The road to Pytorch 1.0

In the context of previous upgrades, this one is pretty significant (hence the jump from 0.4 straight to 1.0). Each previous Pytorch release addressed a core enhancement or improvement to the framework. The focus of Pytorch 1.0 is the introduction of torch.jit, a high-level compiler, motivated by the need to address two major issues:

Separating the model from the code: prior to 1.0, the AI model *was* the code written in Pytorch, and running it required a Python VM/environment. This isn't the best fit for custom compute environments, mobile systems, multithreaded servers, etc. Pytorch 1.0 lets you separate the model from the Python code via function and class annotations in torch.jit. These functions/classes are compiled into a high-level representation that can be inspected, exported, and saved to disk.

Efficient model optimization for custom hardware: a Pytorch model could run on a server, TPUs, GPUs, mobile devices, etc. Pure imperative execution of Python code would miss a lot of optimization opportunities. torch.jit models can be inspected ahead of time to perform whole-program optimizations, node fusion, etc. at different levels of granularity.

The Pytorch 1.0 preview is supposed to be fairly stable and mature, with roughly 90% of the features already in. At Facebook this system is already used in production, and the team doesn't expect developers to run into critical bugs. Expect the stable 1.0 release sometime around NIPS 2018.

A deep dive into torch.jit

Eager execution mode is a great choice for research: we can use arbitrary Python libraries in our code and hack a prototype together quickly, using the debugging tools of our choice. Things start to look more cumbersome once we have a small set of models that we want to take to production. The requirement of a Python interpreter makes this uncomfortable for mobile environments or multithreaded server setups. And even with a Python interpreter, there are limits to how well the code can be optimized and parallelized.

Script mode code, by contrast, can be executed without a Python interpreter. A script-mode model is written in a subset of Python that can be optimized and converted into an intermediate format. Script mode has all the basic building blocks required for building AI models, but it restricts the more dynamic behaviors of the Python language.

The ideal transition works like this: most people start off writing programs in eager mode, which is regular Pytorch code as we know it. When you want to move a particular model to production, you can use the torch.jit tools to annotate the code and take it from eager to script mode. This can be done incrementally, starting with the parts of the model you are most confident about, while preserving debuggability and testing for the rest of the code. To actually make the transition from one form to the other, torch.jit provides two methods:

torch.jit.trace — This works for existing eager Python models that are ‘straight-line’, with no control flow (e.g. vision CNN models). The tracer runs the module on supplied inputs, recording the tensor operations it encounters.

For example, one could run torch.jit.trace on a regular Python function foo, passing in the expected inputs. The tracer infrastructure records which operations occur and stores them in the object traced_foo. traced_foo is a self-contained bundle that can be run independently of Python, saved to disk, loaded elsewhere, and so on.
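Here is a minimal sketch of what that looks like; the function body, inputs, and file name are my own illustrative assumptions:

```python
import torch

# A regular eager-mode Python function, standing in for `foo` above.
def foo(x, y):
    return 2 * x + y

# Run the tracer once with example inputs; it records the tensor
# operations it encounters and compiles them into traced_foo.
traced_foo = torch.jit.trace(foo, (torch.rand(3), torch.rand(3)))

# traced_foo behaves like foo, but is a self-contained bundle that can
# be saved to disk and later run without the original Python source.
print(traced_foo(torch.ones(3), torch.ones(3)))
traced_foo.save("traced_foo.pt")
```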

torch.jit.script — This is useful for models where control flow is important, such as a custom RNN. Here we write code directly in Torch Script, a subset of Python amenable to optimization. We use class and method annotations (@torch.jit.script, @torch.jit.script_method) to mark the Torch Script parts of the Python code. Loops, print statements, and control flow are preserved, since we are just annotating Python code. To debug this code using standard Python tools, or to switch back to eager execution mode, we only need to remove the annotations.
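A hedged sketch of both annotation styles; the module and its logic are made up for illustration:

```python
import torch

# Annotating a free function: the loop below is preserved in the
# compiled representation, unlike with tracing.
@torch.jit.script
def scripted_loop(x, h):
    for i in range(10):
        h = torch.tanh(x + h)
    return h

# Annotating methods of a ScriptModule subclass works the same way.
class MyCell(torch.jit.ScriptModule):
    def __init__(self):
        super(MyCell, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    @torch.jit.script_method
    def forward(self, x, h):
        # Data-dependent branching also survives compilation.
        if bool(x.sum() > 0):
            h = torch.tanh(self.linear(x) + h)
        return h

print(scripted_loop(torch.rand(4), torch.zeros(4)))
```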

In either case (trace mode or script mode) the essential parts are still written in Python, not in some intermediate meta-programming paradigm. A single model can contain a mix of scripted and traced code. Saving the model and loading it in another environment, say C++, is also straightforward, as sketched below.
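For instance, a compiled module can be serialized from Python and reloaded without it; this is a minimal sketch under the same caveats as above, with the C++ entry point noted in a comment:

```python
import torch

class AddOne(torch.jit.ScriptModule):
    @torch.jit.script_method
    def forward(self, x):
        return x + 1

m = AddOne()
# Serialize the compiled module; the archive bundles code and parameters.
m.save("add_one.pt")

# Reload it in Python; in C++, the equivalent entry point is
# torch::jit::load("add_one.pt").
restored = torch.jit.load("add_one.pt")
print(restored(torch.zeros(2)))
```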

Pytorch 1.0 and C++: the Pytorch C++ frontend

First, Pytorch 1.0 makes it easy to integrate existing C++ functions into your Python environment, for example when you want to bring some OpenCV library functionality into your Python code. Pytorch already has ATen, a library that lets you write tensor code in C++, and Autograd, which adds gradients and differentiability. C++ is supported in both the eager and script mode scenarios: (a) for eager mode, you can compile C++ code into a single Python module for experimentation, and (b) for script mode, you can use the C++ code itself directly.
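A minimal sketch of the Python side of (a), assuming a hypothetical C++ source file my_ops.cpp that defines and binds a function warp_perspective (all names here are illustrative, not from the talk):

```python
from torch.utils.cpp_extension import load

# JIT-compile a C++ extension and import it as a Python module.
# my_ops.cpp is a hypothetical source file that binds its functions
# using pybind11 / the Pytorch extension headers.
my_ops = load(name="my_ops", sources=["my_ops.cpp"])

# The bound C++ function would then be callable from eager-mode Python:
# result = my_ops.warp_perspective(image_tensor, warp_matrix)
```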

Second, for specialized environments there is a C++ frontend that you can use to run even training loops, not just inference. This makes sense for applications with low-latency, bare-metal, or multithreaded requirements. The C++ frontend supports existing Pytorch features like torch::nn, torch::optim, serialization, etc.

References:

Pytorch Developer Conference event. Visuals were taken from slides presented during the event. The Pytorch core dev speakers were Soumith Chintala, Dmytro Dzhulgakov, Zach DeVito, Peter Goldsborough, and Teng Li.
