
Machine learning with graphs supports prediction models in much the same way that well-known unsupervised and semi-supervised techniques feed into supervised models. In general, this means there are two ways that machine learning with graphs can be deployed into the ML workflow. The first is to create a so-called node embedding and pass it into a downstream machine learning task. The second is to perform label and link predictions directly on the graph data structure. Earlier I wrote an introduction to machine learning with graphs and the tasks it includes. This article builds on that post and gives a concise overview of how these tasks fit into the ML workflow.
Machine Learning development workflow
A typical machine learning development workflow consists of the following phases: data retrieval, data preparation (including feature engineering), model training, and model evaluation.

The machine learning development workflow starts with data retrieval, where (mostly) unstructured raw data is acquired from the sources. Next, in the data preparation step, the goal is to transform the data into a structured format so that it can be used for model training. After the model is trained, it can be evaluated using various performance and validation methods. A key task during the data preparation phase is feature engineering: "Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data." Thus, data scientists have to engineer and decide which features are important in predicting the target label. Proper business domain knowledge of the use case is therefore a prerequisite for performing the feature engineering task.
Feature engineering with graph data
This is where feature engineering on graph-structured data can help. The underlying assumption is that data points do not exist in isolation; they occur through interactions with other records in the data set. Instead of relying on the creativity and domain knowledge of the data scientist to come up with features they deem important, one can build a machine learning model during this phase that generates features based on the relational properties between the data points (records) in the dataset. Those features are called node embeddings, where every node is a data point. Every node embedding is thus a representation of the structural relationships that a data point has with the other data points in the data set: the interactions between a node and its neighborhood are captured in the resulting embedding. There is a wide range of methods for creating node embeddings. Many of them aggregate the properties of the local neighborhood nodes and edges into the embedding using a graph neural network (GNN) or a deep learning variant such as the graph convolutional network (GCN). In essence, the GNN tries to encode the node's properties and the relationships it has with other nodes into a vector in a latent space. These graph learning techniques are another topic that I will address in a separate post. The result of this phase is a node embedding that is passed as a feature to a downstream classification or regression task.
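As a concrete illustration, the sketch below learns DeepWalk-style embeddings (uniform random walks fed to a skip-gram model) and passes them to an ordinary downstream classifier. The choice of the Zachary karate club graph that ships with networkx, the hyperparameters, and the walk settings are my own illustrative assumptions, not a prescribed recipe:

```python
import random
import networkx as nx
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# Toy graph: Zachary's karate club, bundled with networkx.
G = nx.karate_club_graph()

def random_walks(graph, walks_per_node=10, walk_length=20):
    """Generate uniform random walks (DeepWalk-style) over the graph."""
    walks = []
    for _ in range(walks_per_node):
        for node in graph.nodes():
            walk = [node]
            while len(walk) < walk_length:
                neighbors = list(graph.neighbors(walk[-1]))
                if not neighbors:
                    break
                walk.append(random.choice(neighbors))
            walks.append([str(n) for n in walk])
    return walks

# Treat each walk as a "sentence" and learn node vectors with skip-gram.
model = Word2Vec(random_walks(G), vector_size=32, window=5, min_count=0, sg=1)

# The node embedding: a 32-dimensional feature vector per node.
X = [model.wv[str(n)] for n in G.nodes()]
# Downstream task: predict which club each member joined after the split.
y = [G.nodes[n]["club"] for n in G.nodes()]
clf = LogisticRegression(max_iter=1000).fit(X, y)
```

Note that the embedding model never sees the labels; it only encodes the structural role of each node, which the downstream classifier then exploits.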

Training and predicting on the graph
In the approach above, feature engineering results in a node embedding that is then used as an input feature vector for another, downstream machine learning model. However, it is also possible to predict node labels or links between nodes directly from the graph-structured data. These tasks are called node classification and link prediction. The key here is that, instead of hand-picking which relationship structure leads to better predictions, the model is trained to find the most important structural patterns and the set of node properties for assigning labels and predicting relationships.
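For link prediction specifically, one common setup (sketched below, under the assumption of node embeddings z already produced by a GNN encoder) scores a candidate edge by the dot product of the two node embeddings and trains that score against observed edges and sampled non-edges:

```python
import torch

def link_probability(z: torch.Tensor, u: int, v: int) -> torch.Tensor:
    """Probability that edge (u, v) exists, given node embeddings z
    of shape [num_nodes, dim] produced by a GNN encoder."""
    return torch.sigmoid((z[u] * z[v]).sum())

def link_loss(z, pos_edges, neg_edges):
    """Binary cross-entropy: observed edges are positives,
    sampled non-edges (negative sampling) are negatives."""
    pos = torch.stack([link_probability(z, u, v) for u, v in pos_edges])
    neg = torch.stack([link_probability(z, u, v) for u, v in neg_edges])
    scores = torch.cat([pos, neg])
    labels = torch.cat([torch.ones(len(pos)), torch.zeros(len(neg))])
    return torch.nn.functional.binary_cross_entropy(scores, labels)
```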

In general, node classification is achieved by training a graph neural network in a fully supervised manner: a loss function is defined between the node embedding (created as in the feature engineering step described above) and a one-hot vector that indicates the class to which the node belongs (the label). The model is thus trained on the node embeddings, and the output of the graph neural network is a label for each node. As a result, there is no need to pass the node embedding as a feature vector into a downstream task for classification or regression.
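A minimal sketch of this end-to-end setup, assuming PyTorch Geometric is installed and using its built-in Cora citation dataset (papers as nodes, citations as edges, paper topics as labels):

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

# Cora citation network; downloaded on first run.
dataset = Planetoid(root="data/Cora", name="Cora")
data = dataset[0]

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_node_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))      # hidden node embeddings
        x = F.dropout(x, p=0.5, training=self.training)
        return self.conv2(x, edge_index)           # one logit per class

model = GCN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
for epoch in range(200):
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    # Cross-entropy against the known labels of the training nodes only.
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()

# Prediction: each node gets its label straight from the graph model.
model.eval()
pred = model(data.x, data.edge_index).argmax(dim=1)
```

Because the loss is defined directly on the network's per-node output, the intermediate embeddings never leave the model; the graph model itself is the classifier.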
Conclusion
First, machine learning with graphs can replace the manual feature selection task during the feature engineering phase. Where the traditional machine learning workflow relies on the data scientist's insight to select features, ML with graphs trains a graph neural network to output a feature vector, called a node embedding, for every node. This embedding can then be passed into any downstream classifier. Second, it is also possible to make predictions directly on the graph data structure. In this case, the node embedding is passed through a graph neural network that predicts the node's label.
Sources
Graph Representation Learning (https://www.cs.mcgill.ca/~wlh/grl_book/)
Representation Learning on Graphs: Methods and Applications (https://arxiv.org/abs/1709.05584)
Discover Feature Engineering (https://machinelearningmastery.com/discover-feature-engineering-how-to-engineer-features-and-how-to-get-good-at-it/)