Merging ONNX graphs
Join, merge, split, and concatenate ONNX graphs using sclblonnx.
ONNX is getting more and more popular. While initially conceived predominantly as a file format to simply store AI/ML models, its use has changed in recent years. Nowadays, we see many data scientists use ONNX as a means to build and curate complete data processing pipelines. As the usage of ONNX grows, so does the need for good tools for creating, inspecting, and editing ONNX graphs. Luckily, a large ecosystem for ONNX is emerging; in this post we describe the ONNX join, split, merge, and concatenate functionality offered by the sclblonnx package (curated by Scailable). Note that merging, splitting, and concatenating ONNX graphs is extremely useful when you are actively curating useful “subgraphs” of ONNX: for example, you might have your preferred pre- and post-processing steps in a data pipeline stored in ONNX format, and you want to join these subgraphs with a model you have just trained in TensorFlow or PyTorch. In this post I try to explain how this can be done.
Note: I have written about ONNX editing and merging before, see https://towardsdatascience.com/creating-editing-and-merging-onnx-pipelines-897e55e98bb0. However, with the release of sclblonnx 0.1.9 the functionality has been greatly extended.
Some ONNX background
Before discussing the new merge, concat, split, and join functionalities for ONNX graphs as provided by sclblonnx 0.1.9, it is useful to provide a bit more background about ONNX graphs. At this point in the text I assume you know some ONNX basics (if not, see this article, or this one). Thus, you know that ONNX provides a description of a directed computational graph which specifies which operations to execute on the (strongly typed) input tensors to produce the desired output tensor. And you know that ONNX is useful for storing trained AI/ML models, and for creating data science pipelines, in a way that is independent of platform and deployment target. In short, you generally know ONNX is fun stuff.
However, to understand how one can merge, split, join, and concatenate ONNX graphs we need a bit more background. We need to understand how edges in a graph are created, and we need to understand in a bit more detail the role of the graph’s inputs and outputs.
Implicit edge creation by name
Let’s start with the creation of edges. Although an ONNX graph is simply a directed graph, and it could thus be described by its nodes and edges, this is not how we create (nor store) an ONNX graph. When creating an ONNX graph, we do not explicitly create an adjacency matrix to identify the edges between nodes. Rather, we create nodes of some type (the different operators), each with named inputs and outputs. This is also all that is stored in the ONNX file (which is actually just a protobuf): the file stores a list of operator types, each with their own named input(s) and output(s). In the end, the names allow for the construction of the edges in the graph: if node n1 has an output named x1, and node n2 has an input named x1, a (directed) edge between n1 and n2 will be created. If subsequently another node, n3, is added, which also has an input named x1, we end up with the following graph:
Thus, when merging, joining, concatenating, and splitting ONNX (sub)graphs, it is essential to know the input and output names present in the graphs that you end up combining. These names are in some sense internal, and they may be unknown to you if you exported to ONNX from one of the various training tools. If the same name occurs in both graphs, carelessly merging the graphs will cause potentially unwanted edges to be drawn.
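The name-based edge construction described above can be sketched in a few lines of plain Python. This is a toy model, not the onnx or sclblonnx API: the Node tuple and the derive_edges helper are hypothetical, and the names x0, x2, and x3 are added only so that every node has an input and an output.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """Toy stand-in for an ONNX node: only a name and tensor names."""
    name: str
    inputs: tuple
    outputs: tuple


def derive_edges(nodes):
    """Return (producer, consumer, tensor_name) for every name match."""
    edges = []
    for producer in nodes:
        for consumer in nodes:
            if producer is consumer:
                continue
            for tensor in producer.outputs:
                if tensor in consumer.inputs:
                    edges.append((producer.name, consumer.name, tensor))
    return edges


# n1 produces x1; n2 and n3 both consume x1 (other names are hypothetical)
nodes = [
    Node("n1", inputs=("x0",), outputs=("x1",)),
    Node("n2", inputs=("x1",), outputs=("x2",)),
    Node("n3", inputs=("x1",), outputs=("x3",)),
]
edges = derive_edges(nodes)
print(edges)  # [('n1', 'n2', 'x1'), ('n1', 'n3', 'x1')]
```

No edge was ever declared explicitly: both edges leaving n1 exist only because the string x1 appears as an output of one node and as an input of two others.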
Inputs and outputs of the graph
Another somewhat confusing, but quite essential, concept when merging, splitting, or otherwise editing ONNX graphs is the distinction between the inputs and outputs of a node (which, as we just discussed, are used to create the edges), and the inputs and outputs of the graph itself. Graph inputs and outputs represent the tensors that are fed to the computational graph, and the tensors that result from carrying out the computations, respectively. Inputs and outputs are implicitly connected to a graph in the same way as edges are created. Actually, a reasonable mental model of the inputs and outputs of the graph is that they are simply nodes with either only an output (the inputs to the graph) or only an input (the outputs of the graph); the other side of these special nodes, which do not operate on the tensors, is the outside world.
Ok, that was a bit cryptic.
Let’s give a few examples using the following notation:
I1(name) # An input (to the graph) with a specific name
O1(name) # An output (of the graph) with a specific name
N1({name, name, ...}, {name, name, ...}) # A node, with a list of inputs and outputs.
Given this notation we can for example denote
# A simple graph:
I1(x1)
I2(x2)
O3(x3)
N1({x1,x2},{x3})
Which would generate (using orange for inputs and outputs) the following graph:
If N1 is the Add operator, this graph would simply encode adding two tensors.
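To make the role of the graph inputs and outputs concrete, here is a minimal sketch that evaluates this simple graph in plain Python. It is a toy evaluator under stated assumptions, not how an ONNX runtime works: run_graph and add are hypothetical helpers, tensors are plain lists, and the fed values are made up for illustration.

```python
def run_graph(inputs, nodes, outputs, feeds):
    """Evaluate a toy graph; nodes are (op, input_names, output_names)."""
    missing = set(inputs) - set(feeds)
    if missing:
        raise ValueError(f"missing graph inputs: {sorted(missing)}")
    values = dict(feeds)  # tensor name -> value
    for op, in_names, out_names in nodes:  # assumes topological order
        results = op(*(values[name] for name in in_names))
        for name, result in zip(out_names, results):
            values[name] = result
    # Only the declared graph outputs are handed back to the outside world
    return {name: values[name] for name in outputs}


def add(a, b):
    # Element-wise addition; returns a tuple with one entry per node output
    return ([x + y for x, y in zip(a, b)],)


graph_inputs = ["x1", "x2"]            # I1(x1), I2(x2)
graph_outputs = ["x3"]                 # O3(x3)
nodes = [(add, ["x1", "x2"], ["x3"])]  # N1({x1,x2},{x3})

result = run_graph(graph_inputs, nodes, graph_outputs,
                   {"x1": [1.0, 2.0], "x2": [10.0, 20.0]})
print(result)  # {'x3': [11.0, 22.0]}
```

The graph inputs only say which names must be fed from outside, and the graph outputs only say which names are returned; the computation itself is fully described by the node and its named inputs and outputs.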
Let’s do a slightly more complex graph:
I1(x1)
I2(x2)
N1({x1, x2}, {x3})
N2({x2, x3}, {x4})
O1(x3)
O2(x4)
Which would graphically result in:
Ok, so now we are clear on how the internal edges, and the inputs and outputs of the graph, are constructed; let’s have a closer look at the tools in the sclblonnx package!
Manipulating ONNX graphs using sclblonnx
As of version 0.1.9, the sclblonnx package contains a number of higher level utility functions to combine multiple ONNX (sub)graphs into a single graph. Although earlier versions of the package already contained the merge function to effectively paste two graphs together (more on this later), the update presents merge, join, and split as higher level wrappers around a much more versatile (and harder to use) function called concat. Let’s start with the higher level functions.
All the functions described here can be found in Python code in the examples presented with the sclblonnx package. These can be found at https://github.com/scailable/sclblonnx/blob/master/examples/example_merge.py. Also, please see the docs of each of the discussed functions: https://github.com/scailable/sclblonnx/blob/master/sclblonnx/merge.py
Merge
merge effectively takes two graphs (a parent and a child), and pastes the identified outputs of the parent to the identified inputs of the child. By default, merge assumes that both graphs are complete (i.e., all edges nicely match up, and all inputs and outputs are defined). The signature of merge is
merge(sg1, sg2, io_match)
Where sg1 is the parent subgraph, sg2 the child, and io_match is a list of pairs of names of outputs of sg1 that need to be matched to inputs of sg2. Thus, given the notation developed in the previous section, if we have:
# Parent (sg1)
I1(x1)
N1({x1},{x2})
O1(x2)
# Child (sg2)
I2(z1)
N2({z1},{z2})
O2(z2)
A call to merge(sg1, sg2, [(x2,z1)]) would create:
I1(x1)
N1({x1},{x2})
N2({x2},{z2})
O2(z2)
However, as you can imagine, we can do much more versatile merges using this function. Note that merge assumes there are no “conflicts” in the internal naming of the two graphs; if this is not the case and you would like to control this behavior in more detail, I recommend using concat; merge is merely a user-friendly wrapper around concat.
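The merge example above can be reproduced with a small toy model that tracks names only. This is a sketch of the described behavior, not the sclblonnx implementation: toy_merge and the dict-based graph layout are hypothetical, and no tensor types or shapes are checked.

```python
def toy_merge(parent, child, io_match):
    """Paste parent outputs onto child inputs (toy model, names only)."""
    # Map each matched child input name to the parent output feeding it
    rename = {child_in: parent_out for parent_out, child_in in io_match}
    matched_outputs = {parent_out for parent_out, _ in io_match}

    child_nodes = [
        (name, [rename.get(i, i) for i in ins], outs)
        for name, ins, outs in child["nodes"]
    ]
    return {
        # Matched child inputs are now fed by the parent, so they disappear
        "inputs": parent["inputs"]
                  + [i for i in child["inputs"] if i not in rename],
        "nodes": parent["nodes"] + child_nodes,
        # Matched parent outputs become internal edges
        "outputs": [o for o in parent["outputs"] if o not in matched_outputs]
                   + child["outputs"],
    }


# Parent (sg1): I1(x1), N1({x1},{x2}), O1(x2)
sg1 = {"inputs": ["x1"], "nodes": [("N1", ["x1"], ["x2"])], "outputs": ["x2"]}
# Child (sg2): I2(z1), N2({z1},{z2}), O2(z2)
sg2 = {"inputs": ["z1"], "nodes": [("N2", ["z1"], ["z2"])], "outputs": ["z2"]}

merged = toy_merge(sg1, sg2, [("x2", "z1")])
print(merged)
```

The result has the single input x1, the nodes N1({x1},{x2}) and N2({x2},{z2}), and the single output z2, which matches the merged graph listed above.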
Split & Join
Like merge, split and join are also higher level wrappers around concat (which we detail below). The behavior is relatively simple:
split takes one “parent” with multiple outputs, and pastes one subgraph to a subset of these outputs (by matching the inputs of the subgraph), and another subgraph to another subset of the outputs of the parent. Thus, effectively, split creates a branch where the parent graph feeds into two children.
join is the converse of split in many ways: it takes two “parents” and only a single child. The parents are matched by their outputs to inputs of the child. Hence, join effectively joins two branches of subgraphs in a larger tree.
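A rough sketch of how split and join can each be expressed as two pasting steps, again in a names-only toy model. The helpers paste, toy_split, and toy_join are hypothetical, and the order in which the real functions combine the graphs is an assumption here, not something taken from the sclblonnx source.

```python
def paste(parent, child, io_match):
    """Toy stand-in for one pasting step: parent outputs -> child inputs."""
    rename = {c_in: p_out for p_out, c_in in io_match}
    matched = {p_out for p_out, _ in io_match}
    nodes = parent["nodes"] + [
        (name, [rename.get(i, i) for i in ins], outs)
        for name, ins, outs in child["nodes"]
    ]
    return {
        "inputs": parent["inputs"]
                  + [i for i in child["inputs"] if i not in rename],
        "nodes": nodes,
        "outputs": [o for o in parent["outputs"] if o not in matched]
                   + child["outputs"],
    }


def toy_split(parent, child1, child2, match1, match2):
    # Two steps: paste child1 onto the parent, then child2 onto the result
    return paste(paste(parent, child1, match1), child2, match2)


def toy_join(parent1, parent2, child, match1, match2):
    # Two steps: put the parents side by side, then paste the child onto both
    both = paste(parent1, parent2, io_match=[])
    return paste(both, child, match1 + match2)


def graph(inputs, node, outputs):
    """Build a one-node toy graph."""
    return {"inputs": inputs, "nodes": [node], "outputs": outputs}


# split: one parent with outputs a and b feeds two children
parent = graph(["x1"], ("P", ["x1"], ["a", "b"]), ["a", "b"])
left = graph(["l_in"], ("L", ["l_in"], ["l_out"]), ["l_out"])
right = graph(["r_in"], ("R", ["r_in"], ["r_out"]), ["r_out"])
branched = toy_split(parent, left, right, [("a", "l_in")], [("b", "r_in")])

# join: two parents feed a single child with two inputs
p1 = graph(["u1"], ("A", ["u1"], ["a"]), ["a"])
p2 = graph(["u2"], ("B", ["u2"], ["b"]), ["b"])
child = graph(["c1", "c2"], ("C", ["c1", "c2"], ["y"]), ["y"])
joined = toy_join(p1, p2, child, [("a", "c1")], [("b", "c2")])

print(branched["outputs"])  # ['l_out', 'r_out']
print(joined["inputs"])     # ['u1', 'u2']
```

After the split, the graph keeps the single input x1 and exposes one output per branch; after the join, it keeps one input per parent and exposes the single output y.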
Concat
The workhorse for the merge, split, and join functions described above is the much more versatile concat function. The easiest way to understand its functioning is perhaps a look at the signature and the docs:
def concat(
        sg1: xpb2.GraphProto,
        sg2: xpb2.GraphProto,
        complete: bool = False,
        rename_nodes: bool = True,
        io_match: [] = None,
        rename_io: bool = False,
        edge_match: [] = None,
        rename_edges: bool = False,
        _verbose: bool = False,
        **kwargs):
    """
    concat concatenates two graphs.

    Concat is the flexible (but also rather complex) workhorse for the merge, join, and split functions and
    can be used to pretty flexibly paste together two (sub)graphs. Contrary to merge, join, and split, concat
    does not by default assume the resulting onnx graph to be complete (i.e., to contain inputs and outputs and
    to pass check()), and it can thus be used as an intermediate function when constructing larger graphs.

    Concat is flexible and versatile, but it takes time to master. See example_merge.py in the examples folder
    for a number of examples.

    Args:
        sg1: Subgraph 1, the parent.
        sg2: Subgraph 2, the child.
        complete: (Optional) Boolean indicating whether the resulting
            graph should be checked using so.check().
            Default False.
        rename_nodes: (Optional) Boolean indicating whether the names of
            the nodes in the graph should be made unique.
            Default True.
        io_match: (Optional) Dict containing pairs of outputs of sg1 that
            should be matched to inputs of sg2.
            Default [].
        rename_io: (Optional) Boolean indicating whether the inputs and
            outputs of the graph should be renamed.
            Default False.
        edge_match: (Optional) Dict containing pairs edge names of sg1
            (i.e., node outputs) that should be matched to edges of
            sg2 (i.e., node inputs).
            Default [].
        rename_edges: (Optional) Boolean indicating whether the edges
            should be renamed.
            Default False
        _verbose: (Optional) Boolean indicating whether verbose output
            should be printed (default False)

    Returns:
        The concatenated graph g, or False if something goes wrong along the way.
    """
    ## The implementation...
As is clear from the signature, the higher level wrapper merge simply calls concat once, pretty much with its default arguments. The split and join functions each call concat twice to achieve their desired result. Note that the argument rename_edges allows one to control whether all edges in the subgraph should be renamed (thus avoiding possibly unwanted implicit edge creation), whereas complete allows one to use merge to operate on partial graphs (i.e., graphs that do not yet have all of their edges defined).
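Why renaming edges helps can be illustrated with a names-only toy model. The helpers edges and rename_all are hypothetical, and the prefixing scheme is an assumption chosen for illustration; the naming scheme that sclblonnx actually applies when rename_edges is set is not shown in this post.

```python
def edges(nodes):
    """Derive (producer, consumer, tensor) edges purely from matching names."""
    return [(p, c, t)
            for p, _, p_outs in nodes
            for c, c_ins, _ in nodes
            for t in p_outs
            if p != c and t in c_ins]


def rename_all(nodes, prefix):
    """Prefix every tensor name so it cannot collide with another graph."""
    return [(name, [prefix + i for i in ins], [prefix + o for o in outs])
            for name, ins, outs in nodes]


# Both subgraphs happen to use the internal name "h" (hypothetical names)
sg1_nodes = [("A1", ["x"], ["h"]), ("A2", ["h"], ["y"])]
sg2_nodes = [("B1", ["z"], ["h"]), ("B2", ["h"], ["w"])]

# Carelessly putting the nodes together creates cross-graph edges via "h"
naive = edges(sg1_nodes + sg2_nodes)
# Renaming the second subgraph first keeps the two graphs apart
safe = edges(sg1_nodes + rename_all(sg2_nodes, "sg2_"))

print(len(naive))  # 4
print(safe)        # [('A1', 'A2', 'h'), ('B1', 'B2', 'sg2_h')]
```

In the naive case, A1 also feeds B2 and B1 also feeds A2, simply because both subgraphs reused the name h. After renaming, only the two intended edges remain, and any connection between the subgraphs has to be requested explicitly.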
Wrap up
I hope the above sheds some light on the immense possibilities ONNX has to offer, and on some of the tools to manipulate ONNX graphs manually. We think ONNX is great for storing, in a platform-independent fashion, (sub)graphs that do the useful bits otherwise associated with a data science pipeline. Tooling such as the sclblonnx package subsequently enables users to create full pipelines using the subgraphs as building blocks.
In this post I have deliberately omitted issues regarding matching the dimensions and types of the inputs and outputs of nodes. I hope that by focusing on the rough structure of the graph(s) involved, the operations are easier to follow. When creating actual functioning graphs, the types and dimensions of the various tensors involved obviously matter.
Disclaimer
It’s good to note my own involvement here: I am a professor of Data Science at the Jheronimus Academy of Data Science and one of the cofounders of Scailable. Thus, no doubt, I have a vested interest in Scailable; I have an interest in making it grow such that we can finally bring AI to production and deliver on its promises. The opinions expressed here are my own. Note