
Simplicial Complex Representation Learning

Learning entire simplicial complex representations using higher order geometric message passing schemes


Photo by Ronan Furuta on Unsplash

Object representation learning aims to learn a mapping that embeds the elementary components of an object into some Euclidean space while preserving the object’s structural information. Such methods have recently gained great momentum, especially in graph representation learning, which has attracted considerable attention over the past few years with success in both node-level and graph-level representation learning. The resulting representations are versatile and can be used for almost any downstream machine learning task, such as graph classification or graph similarity.

The purpose of this article is to generalize graph representation learning to geometric/combinatorial objects called simplicial complexes [2]. Simplicial complexes are generalizations of graphs, and they can store relations that go beyond the pairwise relations typically modeled by the edges of a graph. Examples of simplicial complexes are given in the figure below.

Examples of simplicial complexes (SCs)

In the previous figure, the left SC contains two faces. Each of these faces can be thought of as a relation, or a higher-dimensional "edge", among the three edges that bound it. From this perspective it is natural to generalize the familiar message passing on graphs [3] to simplicial complexes: higher order faces can be used to send messages between the lower dimensional faces that bound them. We consider this next.
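To make the structure concrete, here is a minimal sketch of how such a complex could be stored in code. This is a toy representation of our own (not a library API): a set of frozensets of vertices, closed under taking faces.

```python
from itertools import combinations

def simplicial_closure(top_simplices):
    """Return the given simplices together with all of their faces."""
    complex_ = set()
    for simplex in top_simplices:
        vertices = tuple(simplex)
        for k in range(1, len(vertices) + 1):
            for face in combinations(vertices, k):
                complex_.add(frozenset(face))
    return complex_

# A small complex with two triangular faces and one extra edge
# (the labels are made up, not read off the article's figure).
X = simplicial_closure([{1, 2, 3}, {2, 3, 4}, {4, 5}])
print(sum(len(s) == 1 for s in X), "vertices,",
      sum(len(s) == 2 for s in X), "edges,",
      sum(len(s) == 3 for s in X), "faces")
# -> 5 vertices, 6 edges, 2 faces
```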

Geometric Message Passing Schemes

Consider, for instance, the following example:

Figure 1: An example of message passing on a simplicial complex [1]

We seek to define deep learning protocols on simplicial complexes by generalizing the familiar message passing idea from graphs [3]. We will illustrate the computation graph with respect to the red node in Figure 1. This red node is adjacent to v2, v3 and v4 via the edges e1, e3 and e2, respectively. We define a computational graph that takes these nodes and edges into account; this should be familiar from message passing schemes on graphs [3]. Now, to consider higher order message passing protocols between edges, we look at the faces [1,7].

Figure 2: Going deeper in the message passing [1]

We observe from the figure that the blue edges are adjacent to the light and dark pink edges in the sense that they all bound the same pink face. For this reason the computational graph appears as given in Figure 2. The same can be said about the other edges. Observe that the computational graph in Figure 2 contains signals obtained from nodes, edges, and faces. This message passing protocol effectively collects information from the neighborhood of the vertex in a radial fashion, as shown in the following figure:

Figure 3: The message passing explained in Figure 2 collects local information in a radial fashion. Each time we go deeper in the network we collect more information, as illustrated in the figure.
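The adjacency relation driving this scheme can be computed directly from the toy representation introduced earlier: two edges are adjacent when they bound a common face. A hypothetical helper (our own code, not a library function) might look as follows:

```python
from itertools import combinations

def closure(tops):
    """All faces of the given top-level simplices, as frozensets of vertices."""
    return {frozenset(f) for s in tops for k in range(1, len(s) + 1)
            for f in combinations(tuple(s), k)}

def up_adjacent(complex_, simplex):
    """m-simplices that share a common (m+1)-coface with `simplex`."""
    m = len(simplex)  # number of vertices = dimension + 1
    cofaces = [c for c in complex_ if len(c) == m + 1 and simplex < c]
    return {o for c in cofaces for o in complex_
            if len(o) == m and o < c and o != simplex}

X = closure([{1, 2, 3}, {2, 3, 4}])  # two triangles sharing the edge {2, 3}
print(sorted(map(sorted, up_adjacent(X, frozenset({2, 3})))))
# -> [[1, 2], [1, 3], [2, 4], [3, 4]]  (the four other edges of the triangles)
```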

Three Message Passing Schemes

The above message passing scheme is called the adjacency message passing scheme (AMPS), as it takes into account the adjacency relations of a given simplicial complex. Written as an equation, the scheme looks as follows:

Adjacency message passing scheme equation [7]
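In schematic form (a simplified version of the update in [7]; phi denotes a trainable update function):

$$ H_{m}^{\,k+1} \;=\; \phi\big( H_{m}^{\,k},\; H_{m+1}^{\,k};\; \theta_{m}^{\,k} \big) $$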

Here H represents the vector of simplex embeddings. The subscript m is the dimension of the simplex, and the superscript represents the computational stage. Theta is the trainable parameter. Observe that in the previous equation the update takes into account signals obtained from dimensions m and m+1. This is consistent with Figures 1, 2 and 3 above. Furthermore, the above equation generalizes the familiar graph neural network message passing protocols:

A general purpose message passing equation on graphs [7]
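This general form, as in [3], updates a node embedding from its neighbors and the features of the connecting edges (written here schematically, in the same spirit as the AMPS equation above):

$$ h_{v}^{\,k+1} \;=\; \phi\Big( h_{v}^{\,k},\; \sum_{u \in \mathcal{N}(v)} \psi\big( h_{v}^{\,k},\, h_{u}^{\,k},\, e_{uv} \big) \Big) $$

Here N(v) is the set of neighbors of the node v, and e_{uv} is the feature of the edge between u and v.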

Observe that the two equations are essentially the same when m = 0.

The case of simplicial complexes is far more interesting, however, and other natural message passing protocols can be defined.

The co-adjacency message passing scheme (CMPS) is a message passing scheme that starts from the higher faces and takes the computation to lower dimensional faces, as illustrated in the following figure:

Figure 4: The coadjacency message passing scheme [1]

Here the flow of information is given via:

Figure 5: The flow of information in the CMPS

Observe that the information flows from higher order faces to lower ones, opposite to the previous case. Such a scheme is useful if one needs, say, to classify the faces of a given simplicial complex.

Schematically, in the same notation as the AMPS equation above (see [7] for the precise form), the equation of this scheme is:
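$$ H_{m}^{\,k+1} \;=\; \phi\big( H_{m}^{\,k},\; H_{m-1}^{\,k};\; \theta_{m}^{\,k} \big) $$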

Note that it is very similar to the AMPS equation; the only difference is that the update here takes into account lower adjacent faces instead of higher adjacent faces.

The third message passing protocol that we introduce here is one that takes into account the boundary and coboundary relations. It is easiest to see this with an example. Consider the pink edge: the boundary of this edge consists of the red and green nodes, and the coboundary consists of the blue and pink faces. This explains the computational graph in Figure (a) below.

Figure 6: The boundary/coboundary message passing scheme (BCMPS)

Going deeper into the network above gives us the computational graph in Figure (b) above. The flow of information here is given in the following figure:

Figure 7: The flow of information in the BCMPS given in Figure 6.
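In the same schematic notation as before, the BCMPS update of the m-dimensional embeddings combines signals from both the boundary (dimension m-1) and the coboundary (dimension m+1):

$$ H_{m}^{\,k+1} \;=\; \phi\big( H_{m}^{\,k},\; H_{m-1}^{\,k},\; H_{m+1}^{\,k};\; \theta_{m}^{\,k} \big) $$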

The above three message passing protocols first appeared in [1,7].
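As a concrete illustration, here is a minimal numpy sketch of one AMPS-style update — a simplified reading of [1,7] rather than their exact layer. The incidence matrix B, the weight matrices, and the tanh nonlinearity are all our own illustrative choices.

```python
import numpy as np

# B is the incidence matrix between m- and (m+1)-simplices:
# B[i, j] = 1 when m-simplex i bounds (m+1)-simplex j. Then A = B B^T
# (off-diagonal) is the adjacency of m-simplices through shared cofaces.

def amps_layer(H_m, H_up, B, W_self, W_adj, W_up):
    A = B @ B.T
    np.fill_diagonal(A, 0.0)            # adjacency via shared (m+1)-cofaces
    return np.tanh(H_m @ W_self         # the simplex's own signal
                   + A @ H_m @ W_adj    # signal from adjacent m-simplices
                   + B @ H_up @ W_up)   # signal from the (m+1)-simplices

rng = np.random.default_rng(0)
n_edges, n_faces, d = 5, 2, 4
B = rng.integers(0, 2, (n_edges, n_faces)).astype(float)  # toy incidence pattern
H_edges = rng.standard_normal((n_edges, d))
H_faces = rng.standard_normal((n_faces, d))
Ws = [0.1 * rng.standard_normal((d, d)) for _ in range(3)]

print(amps_layer(H_edges, H_faces, B, *Ws).shape)   # -> (5, 4)
```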

Simplicial Complex AutoEncoder

We now show how to use one of the schemes above (AMPS) to define a simplicial complex autoencoder. Given a simplicial complex X, a simplicial complex autoencoder [1,7] consists of three components:

(1) An encoder-decoder system, the trainable component of the autoencoder.

Specifically, the encoder and decoder are functions of the form
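Schematically, writing S(X) for the set of simplices of X and d for the embedding dimension (notation ours, chosen to match the setup of [7]):

$$ \mathrm{enc}: S(X) \to \mathbb{R}^{d}, \qquad \mathrm{dec}: \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}^{+} $$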

The encoder and decoder function on a simplicial complex

(2) A similarity measure on the objects of interest: a user-defined function that captures a notion of similarity between the objects.

A similarity measure on a simplicial complex can be, say, its adjacency matrix.

We want the encoder-decoder system specified above to learn a representation embedding of the simplices such that, schematically (writing sim for the similarity measure chosen in (2)):
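$$ \mathrm{dec}\big( z_a,\, z_b \big) \;\approx\; \mathrm{sim}\big( a,\, b \big) $$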

Here z_a denotes the embedding of the simplex a inside the Euclidean space. To satisfy the above equation, we define a loss function that enforces this relation.

(3) A loss function, a user-defined function used to optimize the encoder-decoder system according to the similarity measure.
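For instance, a generic loss of this kind (one natural choice, not necessarily the exact one used in [7]) is

$$ \mathcal{L} \;=\; \sum_{(a,\,b)} \ell\Big( \mathrm{dec}(z_a, z_b),\; \mathrm{sim}(a, b) \Big), $$

where the pairwise error ℓ is, for example, the squared error.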

The above effectively defines an autoencoder on simplicial complexes. See also [4], [5] and [6] for related treatments.
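To make this concrete, here is a minimal, self-contained sketch of the idea, with several simplifying assumptions of our own: a shallow embedding-lookup encoder, an inner-product decoder, a toy adjacency matrix as the similarity measure, and a squared-error loss trained by plain gradient descent.

```python
import numpy as np

# Toy autoencoder sketch (our simplifications, not the exact setup of [7]):
# - encoder: an embedding lookup table Z, one row per simplex
# - decoder: inner product dec(z_a, z_b) = z_a . z_b
# - similarity measure: a symmetric 0/1 matrix S between simplices
# - loss: sum of squared errors, minimized by plain gradient descent

rng = np.random.default_rng(0)

n, d = 6, 3                                   # number of simplices, embedding dim
S = rng.integers(0, 2, (n, n)).astype(float)  # stand-in similarity matrix
S = ((S + S.T) > 0).astype(float)
np.fill_diagonal(S, 0.0)

Z = 0.1 * rng.standard_normal((n, d))         # trainable simplex embeddings

lr = 0.01
for step in range(501):
    R = Z @ Z.T - S                           # residuals dec(z_a, z_b) - sim(a, b)
    Z -= lr * (4 * R @ Z)                     # gradient of sum(R**2) w.r.t. Z
    if step % 100 == 0:
        print(f"step {step:3d}   loss {np.sum(R**2):.4f}")
```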

Learning Entire Simplicial Complex Representations

Suppose that we have trained an encoder-decoder system as illustrated in the previous section. We now seek to learn an embedding of the simplicial complex X as a whole. Schematically, this embedding is a weighted sum of the simplex embeddings:

$$ h_X \;=\; \sum_{z \in U_X} w_z \, z $$

Here U_X denotes the set of simplex embeddings of X induced by the trained encoder-decoder system (enc, dec) (see [7] for details), and w_z is the weight of the simplex embedding z. These weights can be learned via, for example, a softmax attention of the form

$$ w_z \;=\; \frac{\exp\big(\mathbf{a}^{\top} z\big)}{\sum_{z' \in U_X} \exp\big(\mathbf{a}^{\top} z'\big)}, $$

where the vector a is trainable (this is one standard choice; see [7] for the exact construction). Finally, the entire simplicial complex embedding h_X can be learned in multiple ways. For instance, given a collection of simplicial complexes {X_i}, one may learn complex-to-complex proximity embeddings by minimizing an objective of the form

$$ \sum_{i,\,j} \Big( \big\| h_{X_i} - h_{X_j} \big\| \;-\; D_{ij} \Big)^{2}, $$

where D is an appropriately chosen distance matrix on the simplicial complexes {X_i}.

References

[1] Mustafa Hajij, Kyle Istvan, and Ghada Zamzmi. Cell complex neural networks. In NeurIPS Workshop on Topological Data Analysis and Beyond, 2020.

[2] Allen Hatcher. Algebraic Topology. Cambridge University Press, 2002.

[3] Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., and Dahl, G. E. Neural message passing for quantum chemistry. In Proceedings of the International Conference on Machine Learning, 2017.

[4] Jacob Charles Wright Billings, Mirko Hu, Giulia Lerda, Alexey N. Medvedev, Francesco Mottes, Adrian Onicas, Andrea Santoro, and Giovanni Petri. Simplex2vec embeddings for community detection in simplicial complexes. arXiv preprint, 2019.

[5] Michael T. Schaub, Austin R. Benson, Paul Horn, Gabor Lippner, and Ali Jadbabaie. Random walks on simplicial complexes and the normalized Hodge 1-Laplacian. SIAM Review, 62(2):353–391, 2020.

[6] Celia Hacker. k-simplex2vec: a simplicial extension of node2vec. In NeurIPS Workshop on Topological Data Analysis and Beyond, 2020.

[7] Mustafa Hajij, Ghada Zamzmi, and Xuanting Cai. Simplicial complex representation learning. arXiv preprint arXiv:2103.04046, 2021.

