GRAND SigmaJS

Integrate SigmaJS network visualizations with GRAND stack (GraphQL, React, Apollo, Neo4j)

Tomaz Bratanic
Towards Data Science

Airport routes visualized with SigmaJS. Image by author.

I’ve worked on a couple of projects based on graph data, and one thing they all had in common was the need for a custom network visualization and interaction tool. In my previous post, I investigated how to incorporate VisJS into the GRAND stack framework. Unfortunately, VisJS doesn’t scale well when dealing with thousands of nodes and relationships. Luckily, my friend Jan Zak pointed me towards the SigmaJS library, which should be better at visualizing large networks. I used to be scared of creating visualizations containing more than 1000 nodes and relationships, as they can easily crash your browser, but not anymore thanks to SigmaJS’s performance. The above visualization of airport routes contains around 3500 nodes and 35000 relationships and works just fine. I would say that SigmaJS is geared towards more serious network visualization development, as you can customize the visualizations and develop user interactions for a richer experience. Check out their demo application for some hints at what it can do.

In this blog post, I will be using the GRAND stack in combination with SigmaJS to create nice-looking network visualizations.

GRAND stack. Image from https://grandstack.io/. Content is licensed under CC BY 4.0.

GRAND stack uses Neo4j as the data store. There is typically a React application on the frontend, and the data exchange between React and Neo4j is handled with GraphQL. Check out the official documentation for a detailed explanation.

I have prepared a GitHub repository that contains all the code and the instructions to seed the Neo4j database. In this blog post, I will quickly walk you through the code and data structure.

Neo4j graph construction

First, we need to seed the database. We will be using the data from the OpenFlights website. OpenFlights is a tool that lets you map your flights around the world and, if you wish, share them with the public. All the data is available under the Open Database License. I’ve downloaded the routes dataset from Kaggle and placed it in my Git repository to ease the seeding process.

Airport routes graph schema. Image by author.

We have a simple graph model that contains airports and routes between airports. We also have some additional data about airports, such as their country and location. The ROUTE relationships are directed and weighted. The weight of a relationship represents how many routes, most likely operated by different airlines, exist between the two airports. You can seed the data with a simple script if you are using Linux or macOS.

cat seed_data.cql | docker exec -i neo4j cypher-shell -u neo4j -p letmein

I am not familiar with Windows shell scripting, but if you are using Windows, you can simply copy and paste the Cypher queries into Neo4j Browser.
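To make the relationship weight from the graph model concrete, here is a minimal JavaScript sketch of how raw route records could be collapsed into one weighted ROUTE relationship per ordered airport pair. The field names `source` and `destination` and the sample rows are assumptions for illustration only. The actual import is done in Cypher by the seed script and may work differently.

```javascript
// Hypothetical raw rows: one entry per airline route (field names are assumed).
const rawRoutes = [
  { airline: "AA", source: "JFK", destination: "LAX" },
  { airline: "DL", source: "JFK", destination: "LAX" },
  { airline: "AA", source: "LAX", destination: "JFK" },
];

// Collapse rows into one directed, weighted relationship per airport pair.
function aggregateRoutes(rows) {
  const counts = new Map();
  for (const { source, destination } of rows) {
    const key = `${source}->${destination}`; // direction matters
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, weight]) => {
    const [source, destination] = key.split("->");
    return { source, destination, weight };
  });
}

const weightedRoutes = aggregateRoutes(rawRoutes);
console.log(weightedRoutes);
```

Because the key keeps the direction, JFK to LAX and LAX to JFK end up as two separate relationships with their own weights.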

The script will also execute PageRank and Louvain algorithms in addition to importing the routes. PageRank is a centrality algorithm used to find the most important or influential nodes in a graph, while the Louvain algorithm is used to detect a graph’s community structure. We will use the PageRank score to determine the size of the nodes in the visualization and Louvain communities to color the nodes accordingly.
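As a rough illustration of that mapping, the sketch below rescales PageRank scores into a node size range and assigns one color per Louvain community. The property names `pagerank` and `louvain`, the size range, and the palette are my assumptions, not values taken from the repository.

```javascript
// Assumed palette; any list of distinct colors works.
const PALETTE = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3", "#ff7f00"];

// Linearly rescale a PageRank score into a node size range.
function scaleSize(score, minScore, maxScore, minSize = 2, maxSize = 20) {
  if (maxScore === minScore) return (minSize + maxSize) / 2;
  const t = (score - minScore) / (maxScore - minScore);
  return minSize + t * (maxSize - minSize);
}

// Pick a color per community id, cycling when communities outnumber colors.
function communityColor(communityId) {
  return PALETTE[communityId % PALETTE.length];
}

// Attach size and color to every airport record.
function styleNodes(airportRecords) {
  const scores = airportRecords.map((a) => a.pagerank);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  return airportRecords.map((a) => ({
    id: a.id,
    size: scaleSize(a.pagerank, min, max),
    color: communityColor(a.louvain),
  }));
}

// Hypothetical sample values, only to show the call.
console.log(
  styleNodes([
    { id: "a", pagerank: 1, louvain: 0 },
    { id: "b", pagerank: 3, louvain: 6 },
  ])
);
```

With more communities than palette entries, colors repeat, so a larger palette or a generated color scale may be preferable for a real dataset.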

GraphQL server

Next, we need to develop a GraphQL server that will fetch information from Neo4j and make it available in our React application. Luckily for us, the Neo4j GraphQL library makes this process a walk in the park. It is a low-code library designed for building GraphQL applications on top of a Neo4j graph database. We only need to define the GraphQL schema types, and the library will automagically create resolver functions to fetch or update the data in Neo4j.

The schema definition in the repository is all the configuration we need to have a working GraphQL server. It defines a type Airport, and the type name should be identical to the node label in Neo4j. You simply specify which node properties you want to expose and their types. I have also excluded the CREATE, UPDATE, and DELETE operations, as I am only interested in retrieving data from Neo4j and not updating it. The incoming and outgoing routes fields define that we want to traverse the ROUTE relationship in either the incoming or the outgoing direction. We can also expose relationship properties by defining an interface and referencing it in the relationship field definition.
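For readers who don't have the repository open, a schema along these lines might look roughly like the sketch below. The property names (`pagerank`, `louvain`, `count`, and so on) and the field names are assumptions on my part, and the directives follow the library version available at the time of writing. Newer releases of the Neo4j GraphQL library have changed some of them, so check the documentation for the version you install.

```javascript
// Hypothetical schema sketch; property and field names are assumed,
// not copied from the repository.
const typeDefs = `
  type Airport @exclude(operations: [CREATE, UPDATE, DELETE]) {
    id: ID!
    name: String
    country: String
    latitude: Float
    longitude: Float
    pagerank: Float
    louvain: Int
    # Traverse ROUTE relationships in the outgoing and incoming direction
    outgoingRoutes: [Airport]
      @relationship(type: "ROUTE", direction: OUT, properties: "RouteProperties")
    incomingRoutes: [Airport]
      @relationship(type: "ROUTE", direction: IN, properties: "RouteProperties")
  }

  # Relationship properties are declared as an interface
  interface RouteProperties @relationshipProperties {
    count: Int
  }
`;

// In a real server, this string would be handed to the Neo4j GraphQL library
// together with a Neo4j driver instance to generate the executable schema.
console.log(typeDefs);
```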

Of course, you can always add authorization and custom resolvers if you want. Check the documentation for more information.

React application with SigmaJS

I’ve prepared two versions of the airport routes visualization. One uses the Force Atlas Layout algorithm to calculate the layout of nodes in the visualization.

Airport routes with Force Atlas layout algorithm. Image by author.

For the second version, since we have the latitude and longitude information available, I wanted to test whether we could simply use them as x and y coordinates and see how it goes.
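The idea can be sketched in a few lines: longitude becomes x and latitude becomes y, which amounts to a plain equirectangular projection. The record shape, the property names, and the approximate sample coordinates below are assumptions for illustration. The output loosely follows the key-plus-attributes shape that graphology, the graph library used by recent SigmaJS versions, uses for serialized nodes.

```javascript
// Hypothetical airport records with approximate coordinates (names are assumed).
const sampleAirports = [
  { id: "JFK", name: "John F Kennedy Intl", latitude: 40.64, longitude: -73.78 },
  { id: "LAX", name: "Los Angeles Intl", latitude: 33.94, longitude: -118.41 },
];

// Use geographic coordinates directly as layout coordinates.
function toGeoNode(airport) {
  return {
    key: airport.id,
    attributes: {
      label: airport.name,
      x: airport.longitude, // east-west position
      y: airport.latitude, // north-south position
    },
  };
}

const geoNodes = sampleAirports.map(toGeoNode);
console.log(geoNodes);
```

Since no layout algorithm has to run, this version also skips the iterative layout computation that the Force Atlas variant needs.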

Airport routes with geographical layout. Image by author.

I’m not used to things just working out of the box, so this was a pleasant surprise. I’ve also copied some of the features that are available in the official SigmaJS demo. On the left side, we have the zoom and center-visualization buttons, and on the right side, I’ve added category filters that you can use to filter countries in the visualization.

For example, we can exclude all countries besides the USA in our visualization.
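The logic behind such a filter can be sketched as follows: keep only the airports from the selected countries, then keep only the routes whose two endpoints are both still visible. The function and property names are hypothetical, and the application itself may implement the filter differently, for example by hiding nodes at render time.

```javascript
// Keep airports from the selected countries and the routes between them.
function filterByCountries(nodes, edges, selectedCountries) {
  const selected = new Set(selectedCountries);
  const visibleNodes = nodes.filter((n) => selected.has(n.country));
  const visibleIds = new Set(visibleNodes.map((n) => n.id));
  // A route stays only if both of its endpoints are still visible.
  const visibleEdges = edges.filter(
    (e) => visibleIds.has(e.source) && visibleIds.has(e.target)
  );
  return { nodes: visibleNodes, edges: visibleEdges };
}

// Hypothetical sample data, only to show the call.
const filterNodes = [
  { id: "JFK", country: "United States" },
  { id: "LAX", country: "United States" },
  { id: "LHR", country: "United Kingdom" },
];
const filterEdges = [
  { source: "JFK", target: "LAX" },
  { source: "JFK", target: "LHR" },
];

const usOnly = filterByCountries(filterNodes, filterEdges, ["United States"]);
console.log(usOnly);
```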

Airport routes between airports in the USA. Image by author.

Conclusion

Times are changing. You no longer have to be scared of producing network visualizations with thousands of nodes and relationships. In this example, we are visualizing around 3500 nodes and more than 35000 relationships, and the browser doesn’t even break a sweat. From my beginner React perspective, SigmaJS offers a lot of potential for customization and user interactivity. If you are planning to develop a custom network visualization or interaction tool on top of Neo4j, I would definitely recommend integrating SigmaJS into the GRAND stack. If you have any cool ideas about what could be added to this project, please open an issue or a pull request.

As always, the code is available on GitHub.

Data explorer. Turn everything into a graph. Author of Graph algorithms for Data Science at Manning publication. http://mng.bz/GGVN