(Image by author)

Merging ONNX graphs

Join, merge, split, and concatenate ONNX graphs using sclblonnx.

Maurits Kaptein
Towards Data Science
9 min read · Aug 19, 2021


ONNX is getting more and more popular. While initially conceived predominantly as a file format to simply store AI/ML models, its use has changed in recent years. Nowadays, we see many data scientists use ONNX as a means to build and curate complete data processing pipelines. As the usage of ONNX grows, so does the need for good tools for creating, inspecting, and editing ONNX graphs. Luckily, a large ecosystem for ONNX is emerging; in this post we describe the ONNX join, split, merge, and concatenate functionality offered by the sclblonnx package (curated by Scailable). Merging, splitting, and concatenating ONNX graphs is extremely useful when you are actively curating useful “subgraphs” of ONNX: for example, you might have your preferred pre- and post-processing steps in a data pipeline stored in ONNX format, and you want to join these subgraphs with a model you have just trained in TensorFlow or PyTorch. In this post I try to explain how this can be done.

Note: I have written about ONNX editing and merging before, see https://towardsdatascience.com/creating-editing-and-merging-onnx-pipelines-897e55e98bb0. However, with the release of sclblonnx 0.1.9 the functionalities have been greatly extended.

Some ONNX background

Before discussing the new merge, concat, split, and join functionalities for ONNX graphs as provided by sclblonnx 0.1.9, it is useful to provide a bit more background about ONNX graphs. At this point in the text I assume you know some ONNX basics (if not, see this article, or this one). Thus, you know that ONNX provides a description of a directed computational graph which specifies which operations to execute on the (strongly typed) input tensors to produce the desired output tensor. And you know that ONNX is useful for storing trained AI/ML models, and for creating data science pipelines, in a way that is independent of platform and deployment target. In short, you generally know ONNX is fun stuff.

However, to understand how one can merge, split, join, and concatenate ONNX graphs we need a bit more background. We need to understand both how edges in a graph are created and, in a bit more detail, the role of the graph’s inputs and outputs.

Implicit edge creation by name

Let’s start with the creation of edges. Although an ONNX graph is simply a directed graph, and it could thus be described by its nodes and edges, this is not how we create (nor store) an ONNX graph. When creating an ONNX graph, we do not explicitly create an adjacency matrix to identify the edges between nodes. Rather, we create nodes of some type (the different operators), each with named inputs and outputs. This is also all that is stored in the ONNX file (which is actually just a protobuf): the file stores a list of operator types, each with their own named input(s) and output(s). In the end, the names allow for the construction of the edges in the graph: if node n1 has an output named x1, and node n2 has an input named x1, a (directed) edge between n1 and n2 will be created. If subsequently another node, n3, is added, which also has an input named x1, we end up with the following graph:

(Image by author)
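To make the implicit edge creation concrete, here is a minimal sketch in plain Python. It does not use the onnx or sclblonnx packages: the tuple representation of a node and the helper derive_edges are my own assumptions for illustration, as are the tensor names x0, y2, and y3. The only thing it shows is that edges follow purely from matching names.

```python
# Toy model (not the ONNX API): a node is (node_name, input_names, output_names).
nodes = [
    ("n1", ["x0"], ["x1"]),  # n1 produces the tensor named x1
    ("n2", ["x1"], ["y2"]),  # n2 consumes x1
    ("n3", ["x1"], ["y3"]),  # n3 also consumes x1
]


def derive_edges(nodes):
    """Return (producer, consumer, tensor_name) triples implied by shared names."""
    producer_of = {}
    for name, _, outputs in nodes:
        for out in outputs:
            producer_of[out] = name
    edges = []
    for name, inputs, _ in nodes:
        for inp in inputs:
            # An edge exists only if some node produces a tensor with this name.
            if inp in producer_of:
                edges.append((producer_of[inp], name, inp))
    return edges


edges = derive_edges(nodes)
for producer, consumer, tensor in edges:
    print(f"{producer} -> {consumer} (via {tensor})")
```

Nobody ever declared an edge here; adding n3 with an input named x1 was enough to connect it to n1.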

Thus, when merging, joining, concatenating, and splitting ONNX (sub)graphs, it is essential to know the input and output names present in the graphs that you end up combining. These names are in some sense internal, and potentially unknown to you if you have exported to ONNX from one of various training tools. If the same name occurs in both graphs, carelessly merging the graphs will cause potentially unwanted edges to be drawn.
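The following sketch illustrates such an accidental edge, again in plain Python with a toy node representation of my own (the graphs, the name h, and the helpers are hypothetical, not part of sclblonnx). Both graphs happen to use the name h, and simply putting their nodes in one list connects them. A simple set intersection of the names is a cheap check to run before combining graphs.

```python
# Toy model (not the ONNX API): a node is (node_name, input_names, output_names).
def derive_edges(nodes):
    """Return (producer, consumer, tensor_name) triples implied by shared names."""
    producer_of = {out: name for name, _, outputs in nodes for out in outputs}
    return [
        (producer_of[inp], name, inp)
        for name, inputs, _ in nodes
        for inp in inputs
        if inp in producer_of
    ]


def tensor_names(nodes):
    """Collect every input and output name used by a list of nodes."""
    names = set()
    for _, inputs, outputs in nodes:
        names.update(inputs)
        names.update(outputs)
    return names


# The parent uses "h" for an internal edge; the child expects "h" as a graph input.
parent = [("p1", ["x"], ["h"]), ("p2", ["h"], ["y"])]
child = [("c1", ["z"], ["c_out"]), ("c2", ["c_out", "h"], ["w"])]

# Check for shared names before pasting the graphs together.
collisions = tensor_names(parent) & tensor_names(child)
print("shared names:", collisions)

# Careless merge: just put all nodes in one graph.
parent_nodes = {name for name, _, _ in parent}
child_nodes = {name for name, _, _ in child}
unwanted = [
    edge for edge in derive_edges(parent + child)
    if edge[0] in parent_nodes and edge[1] in child_nodes
]
print("edges crossing from parent to child:", unwanted)
```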


Inputs and outputs of the graph

Another somewhat confusing, but quite essential, concept when merging, splitting, or otherwise editing ONNX graphs is the distinction between the inputs and outputs of a node (which, as we just discussed, are used to create the edges), and the inputs and outputs of the graph itself. Graph inputs represent the tensors that are fed to the computational graph; graph outputs represent the tensors that result from carrying out the computations. Inputs and outputs are implicitly connected to a graph by name, in the same way as edges are created. Actually, a reasonable mental model of the inputs and outputs of the graph is that they are simply nodes with either only an output (the inputs to the graph) or only an input (the outputs of the graph). These special nodes do not operate on the tensors; their other side is the outside world.

Ok, that was a bit cryptic.

Let’s give a few examples using the following notation:

I1(name)  # An input (to the graph) with a specific name
O1(name)  # An output (of the graph) with a specific name
N1({name, name, ...}, {name, name, ...})  # A node, with a list of input names and a list of output names

Given this notation we can for example denote

# A simple graph:
I1(x1)
I2(x2)
O3(x3)
N1({x1,x2},{x3})

Which would generate (using orange for inputs and outputs) the following graph:

(Image by author)

If N1 is the Add operator, this graph would simply encode the addition of two tensors.
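As a sanity check on this reading, here is a tiny toy interpreter for graphs written in the notation above. It is a sketch under stated assumptions: tensors are plain Python lists, the dictionary layout and the run_graph helper are hypothetical, and nothing here touches the ONNX runtime. It only shows how graph inputs, node names, and graph outputs tie together.

```python
# Toy model (not the ONNX API): tensors are lists of floats,
# nodes are (op_type, input_names, output_names).
def run_graph(graph, feeds, ops):
    """Evaluate nodes in order; assumes the node list is topologically sorted."""
    values = dict(feeds)  # start with the tensors fed to the graph inputs
    for op_type, inputs, outputs in graph["nodes"]:
        result = ops[op_type](*[values[name] for name in inputs])
        values[outputs[0]] = result  # single-output operators only in this sketch
    return {name: values[name] for name in graph["outputs"]}


# I1(x1), I2(x2), O3(x3), N1({x1,x2},{x3}) with N1 = Add
graph = {
    "inputs": ["x1", "x2"],
    "outputs": ["x3"],
    "nodes": [("Add", ["x1", "x2"], ["x3"])],
}

# Element-wise addition on lists stands in for the real Add operator.
ops = {"Add": lambda a, b: [p + q for p, q in zip(a, b)]}

result = run_graph(graph, {"x1": [1.0, 2.0], "x2": [3.0, 4.0]}, ops)
print(result)  # {'x3': [4.0, 6.0]}
```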

Let’s do a slightly more complex graph:

I1(x1)
I2(x2)
N1({x1, x2}, {x3})
N2({x2, x3}, {x4})
O1(x3)
O2(x4)

Which would graphically result in:

(Image by author)
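The mental model of graph inputs and outputs as special source and sink nodes can be written down directly. The sketch below is again a toy in plain Python (the dictionary layout and the all_edges helper are my own assumptions, not sclblonnx code). It lists every edge of the second example graph, including the ones that touch the outside world.

```python
# Toy model (not the ONNX API): graph inputs and outputs are (label, tensor_name)
# pairs, nodes are (node_name, input_names, output_names).
def all_edges(graph):
    """List all edges, treating graph inputs as sources and outputs as sinks."""
    producer_of = {name: label for label, name in graph["inputs"]}
    for node, _, outputs in graph["nodes"]:
        for out in outputs:
            producer_of[out] = node
    edges = []
    for node, inputs, _ in graph["nodes"]:
        for inp in inputs:
            edges.append((producer_of[inp], node, inp))
    for label, name in graph["outputs"]:
        edges.append((producer_of[name], label, name))
    return edges


graph = {
    "inputs": [("I1", "x1"), ("I2", "x2")],
    "nodes": [("N1", ["x1", "x2"], ["x3"]), ("N2", ["x2", "x3"], ["x4"])],
    "outputs": [("O1", "x3"), ("O2", "x4")],
}

edges = all_edges(graph)
for producer, consumer, tensor in edges:
    print(f"{producer} -> {consumer} (via {tensor})")
```

Note that x3 appears twice as an edge: once feeding N2, and once feeding the graph output O1. A name can be an internal edge and a graph output at the same time.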

Ok, so now we are clear on how the internal edges, and the inputs and outputs to the graph are constructed; let’s have a closer look at the tools in the sclblonnx package!

Manipulating ONNX graphs using sclblonnx

As of version 0.1.9, the sclblonnx package contains a number of higher level utility functions to combine multiple ONNX (sub)graphs into a single graph. Although earlier versions of the package already contained the merge function to effectively paste two graphs together (more on this later), the update presents merge, join, and split as higher level wrappers around a much more versatile (and harder to use) function called concat. Let’s start with the higher level functions.

All the functions described here are demonstrated in the Python examples that ship with the sclblonnx package. These can be found at https://github.com/scailable/sclblonnx/blob/master/examples/example_merge.py. Also, please see the docs of each of the discussed functions: https://github.com/scailable/sclblonnx/blob/master/sclblonnx/merge.py

Merge

merge effectively takes two graphs (a parent and a child) and pastes the identified outputs of the parent to the identified inputs of the child. By default, merge assumes that both graphs are complete (i.e., all edges nicely match up, and all inputs and outputs are defined). The signature of merge is

merge(sg1, sg2, io_match)

Where sg1 is the parent subgraph, sg2 is the child, and io_match is a list of pairs of names: each pair matches an output of sg1 to an input of sg2. Thus, given the notation developed in the previous section, if we have:

# Parent (sg1)
I1(x1)
N1({x1},{x2})
O1(x2)
# Child (sg2)
I2(z1)
N2({z1},{z2})
O2(z2)

A call to merge(sg1, sg2, [(x2,z1)]) would create:

I1(x1)
N1({x1},{x2})
N2({x2},{z2})
O2(z2)
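To see what this pasting amounts to, here is a minimal sketch of the same operation on a toy graph representation. This is not the sclblonnx implementation: toy_merge and the dictionary layout are hypothetical, and the sketch ignores tensor types, dimensions, and name conflicts. It only reproduces the renaming logic of the example above.

```python
# Toy model (not the sclblonnx implementation): a graph is a dict with
# "inputs", "outputs" (lists of names) and "nodes" ((name, inputs, outputs) tuples).
def toy_merge(parent, child, io_match):
    """Paste parent outputs onto child inputs; io_match is [(parent_out, child_in)]."""
    rename = {child_in: parent_out for parent_out, child_in in io_match}
    matched_outputs = {parent_out for parent_out, _ in io_match}
    nodes = list(parent["nodes"])
    for name, inputs, outputs in child["nodes"]:
        # Child nodes now read from the parent's output names.
        nodes.append((name, [rename.get(i, i) for i in inputs], outputs))
    return {
        # Matched child inputs stop being graph inputs ...
        "inputs": parent["inputs"] + [i for i in child["inputs"] if i not in rename],
        # ... and matched parent outputs stop being graph outputs.
        "outputs": [o for o in parent["outputs"] if o not in matched_outputs]
        + child["outputs"],
        "nodes": nodes,
    }


sg1 = {"inputs": ["x1"], "outputs": ["x2"], "nodes": [("N1", ["x1"], ["x2"])]}
sg2 = {"inputs": ["z1"], "outputs": ["z2"], "nodes": [("N2", ["z1"], ["z2"])]}

merged = toy_merge(sg1, sg2, [("x2", "z1")])
print(merged)
```

The result has a single input x1, a single output z2, and N2 now reads x2, which matches the merged graph listed above.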

However, as you can imagine, we can do much more versatile merges using this function. Note that merge assumes there are no “conflicts” in the internal naming of the two graphs; if this is not the case and you would like to control this behavior in more detail, I recommend using concat; merge is merely a user-friendly wrapper around concat.

Split & Join

Like merge, split and join are also higher level wrappers around concat (which we detail below). The behavior is relatively simple:

  • split takes one “parent” with multiple outputs, and pastes one subgraph to a subset of these outputs (by matching the inputs of the subgraph), and another subgraph to another subset of the outputs of the parent. Thus, effectively, split creates a branch where the parent graph feeds into two children.
  • join is the converse of split in many ways: it takes two “parents” and only a single child. The parents are matched by their outputs to inputs of the child. Hence, join effectively joins two branches of subgraphs in a larger tree.
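Both behaviors can be sketched as two consecutive pasting steps on a toy graph representation. The code below is an illustration only: toy_merge, toy_split, and toy_join are hypothetical helpers with signatures I made up for this sketch, so please check the package docs for the actual arguments of split and join.

```python
# Toy model (not the sclblonnx implementation): a graph is a dict with
# "inputs", "outputs" (lists of names) and "nodes" ((name, inputs, outputs) tuples).
def toy_merge(parent, child, io_match):
    """Paste parent outputs onto child inputs; io_match is [(parent_out, child_in)]."""
    rename = {child_in: parent_out for parent_out, child_in in io_match}
    matched = {parent_out for parent_out, _ in io_match}
    child_nodes = [
        (name, [rename.get(i, i) for i in ins], outs)
        for name, ins, outs in child["nodes"]
    ]
    return {
        "inputs": parent["inputs"] + [i for i in child["inputs"] if i not in rename],
        "outputs": [o for o in parent["outputs"] if o not in matched] + child["outputs"],
        "nodes": list(parent["nodes"]) + child_nodes,
    }


def toy_split(parent, child1, child2, match1, match2):
    """One parent feeds two children: two pasting steps in a row."""
    return toy_merge(toy_merge(parent, child1, match1), child2, match2)


def toy_join(parent1, parent2, child, match1, match2):
    """Two parents feed one child: attach parent1 first, then parent2."""
    partial = toy_merge(parent1, child, match1)  # child input for parent2 is still open
    return toy_merge(parent2, partial, match2)


# Split: P produces a and b; C1 consumes a (as u), C2 consumes b (as v).
parent = {"inputs": ["x"], "outputs": ["a", "b"], "nodes": [("P", ["x"], ["a", "b"])]}
c1 = {"inputs": ["u"], "outputs": ["u_out"], "nodes": [("C1", ["u"], ["u_out"])]}
c2 = {"inputs": ["v"], "outputs": ["v_out"], "nodes": [("C2", ["v"], ["v_out"])]}
branched = toy_split(parent, c1, c2, [("a", "u")], [("b", "v")])
print(branched)

# Join: P1 produces a, P2 produces b; C consumes both (as u and v).
p1 = {"inputs": ["x1"], "outputs": ["a"], "nodes": [("P1", ["x1"], ["a"])]}
p2 = {"inputs": ["x2"], "outputs": ["b"], "nodes": [("P2", ["x2"], ["b"])]}
child = {"inputs": ["u", "v"], "outputs": ["y"], "nodes": [("C", ["u", "v"], ["y"])]}
joined = toy_join(p1, p2, child, [("a", "u")], [("b", "v")])
print(joined)
```

In the join, the intermediate graph after the first step still has an open input v; it only becomes a closed pipeline once the second parent is attached.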

Concat

The work-horse for the merge, split, and join functions described above is the much more versatile concat function. The easiest way to understand its functioning is perhaps a look at the signature and the docs:

def concat(
        sg1: xpb2.GraphProto,
        sg2: xpb2.GraphProto,
        complete: bool = False,
        rename_nodes: bool = True,
        io_match: [] = None,
        rename_io: bool = False,
        edge_match: [] = None,
        rename_edges: bool = False,
        _verbose: bool = False,
        **kwargs):
    """
    concat concatenates two graphs.

    Concat is the flexible (but also rather complex) workhorse for the merge, join, and split functions and can be
    used to pretty flexibly paste together two (sub)graphs. Contrary to merge, join, and split, concat does not by
    default assume the resulting onnx graph to be complete (i.e., to contain inputs and outputs and to pass check()),
    and it can thus be used as an intermediate function when constructing larger graphs.

    Concat is flexible and versatile, but it takes time to master. See example_merge.py in the examples folder
    for a number of examples.

    Args:
        sg1: Subgraph 1, the parent.
        sg2: Subgraph 2, the child.
        complete: (Optional) Boolean indicating whether the resulting
            graph should be checked using so.check().
            Default False.
        rename_nodes: (Optional) Boolean indicating whether the names of
            the nodes in the graph should be made unique.
            Default True.
        io_match: (Optional) Dict containing pairs of outputs of sg1 that
            should be matched to inputs of sg2.
            Default [].
        rename_io: (Optional) Boolean indicating whether the inputs and
            outputs of the graph should be renamed.
            Default False.
        edge_match: (Optional) Dict containing pairs edge names of sg1
            (i.e., node outputs) that should be matched to edges of
            sg2 (i.e., node inputs).
            Default [].
        rename_edges: (Optional) Boolean indicating whether the edges
            should be renamed.
            Default False
        _verbose: (Optional) Boolean indicating whether verbose output
            should be printed (default False)
    Returns:
        The concatenated graph g, or False if something goes wrong along the way.
    """
    ## The implementation...

As is clear from the signature, the higher level wrapper merge simply calls concat once, pretty much with its default arguments. The split and join functions each call concat twice to achieve their desired result. Note that the argument rename_edges allows one to control whether all edges in the subgraph should be renamed (thus avoiding possibly unwanted implicit edge creation), whereas complete allows one to use merge to operate on partial graphs (i.e., graphs that do not yet have all of their edges defined).
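The effect of these two arguments can be illustrated with a toy stand-in for concat. Everything in this sketch is an assumption made for illustration: toy_concat is not the package's implementation, the "sg2_" prefix is a renaming scheme I invented, and the completeness check is far simpler than so.check(). It only mimics the documented behavior of returning False when a requested check fails.

```python
# Toy model (not the sclblonnx implementation): a graph is a dict with
# "inputs", "outputs" (lists of names) and "nodes" ((name, inputs, outputs) tuples).
def is_complete(graph):
    """Every node input and graph output must be produced; assumes sorted nodes."""
    produced = set(graph["inputs"])
    for _, inputs, outputs in graph["nodes"]:
        if any(i not in produced for i in inputs):
            return False
        produced.update(outputs)
    return all(o in produced for o in graph["outputs"])


def toy_concat(sg1, sg2, io_match=None, rename_edges=False, complete=False):
    """Paste sg2 behind sg1; optionally rename all sg2 names to avoid collisions."""
    child_to_parent = {c: p for p, c in (io_match or [])}
    matched = set(child_to_parent.values())

    def map_name(name):
        if name in child_to_parent:
            return child_to_parent[name]
        # Hypothetical renaming scheme: prefix every other name in sg2.
        return "sg2_" + name if rename_edges else name

    graph = {
        "inputs": sg1["inputs"]
        + [map_name(i) for i in sg2["inputs"] if i not in child_to_parent],
        "outputs": [o for o in sg1["outputs"] if o not in matched]
        + [map_name(o) for o in sg2["outputs"]],
        "nodes": list(sg1["nodes"]) + [
            (n, [map_name(i) for i in ins], [map_name(o) for o in outs])
            for n, ins, outs in sg2["nodes"]
        ],
    }
    if complete and not is_complete(graph):
        return False
    return graph


def produced_names(graph):
    return [o for _, _, outs in graph["nodes"] for o in outs]


# Both graphs use the internal name "h".
sg1 = {"inputs": ["x1"], "outputs": ["x2"],
       "nodes": [("N1", ["x1"], ["h"]), ("N2", ["h"], ["x2"])]}
sg2 = {"inputs": ["z1"], "outputs": ["z2"],
       "nodes": [("M1", ["z1"], ["h"]), ("M2", ["h"], ["z2"])]}

naive = toy_concat(sg1, sg2, io_match=[("x2", "z1")])
safe = toy_concat(sg1, sg2, io_match=[("x2", "z1")], rename_edges=True, complete=True)
print(produced_names(naive))  # "h" is produced twice: a name conflict
print(produced_names(safe))   # the child's "h" became "sg2_h"

# A partial child (its input z1 is not declared yet) fails the completeness check,
# but can still be concatenated as an intermediate step when complete=False.
partial_child = {"inputs": [], "outputs": ["z2"], "nodes": [("M1", ["z1"], ["z2"])]}
checked = toy_concat(sg1, partial_child, complete=True)
unchecked = toy_concat(sg1, partial_child, complete=False)
```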

Wrap up

I hope the above sheds some light on the immense possibilities ONNX has to offer, and on some of the tools to manipulate ONNX graphs manually. We think ONNX is great for storing, in a platform-independent fashion, (sub)graphs that do useful bits of work otherwise associated with a data science pipeline. Tooling such as the sclblonnx package subsequently enables users to create full pipelines using the subgraphs as building blocks.

In this post I have on purpose omitted issues regarding matching dimensions and types of inputs and outputs to nodes. I hope that by focussing on the rough structure of the graph(s) involved, the operations are easier to follow. When creating actual functioning graphs, the types and dimensions of the various tensors involved obviously matter.

Disclaimer

It’s good to note my own involvement here: I am a professor of Data Science at the Jheronimus Academy of Data Science and one of the cofounders of Scailable. Thus, no doubt, I have a vested interest in Scailable; I have an interest in making it grow such that we can finally bring AI to production and deliver on its promises. The opinions expressed here are my own.


I am a professor at Tilburg University working on bandits, causality, and Bayesian inference with various applications. I also care about AI & ML deployment.