Customizing NetworkX Graphs

Your One Stop Shop for All Things NetworkX

Graph theory is an incredibly potent data science tool that allows you to visualize and understand complex interactions. As part of an open-source project, I’ve collected information from many primary sources to build a graph of relationships between professional theatre lighting designers in New York City.

Image by Author

I used NetworkX, a Python package for constructing graphs. Its defaults are mostly usable, but leveraging matplotlib allows us to customize almost every conceivable aspect of the graph. I knew what I wanted it to look like in my head, but after many hours of searching through documentation and Stack Overflow, I decided to create this one-stop shop for all the things I learned how to change! Now you too can build readable graphs to help visualize complex relationships.


Creating a NetworkX Graph

We will start by making a basic graph! There are several ways to do this, and I found the easiest was to build it from a pandas DataFrame in which you specify the edges. What’s an edge? Well, graphs are built using nodes and edges. A node represents some object, perhaps a person or organization, and an edge represents the actual connection from one node to another node. So in the example below, "A", "B", "C", and "D" are nodes and the lines between them are the edges.
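As a minimal sketch of that approach, the snippet below builds a four-node graph from a DataFrame of edges. The column names "from" and "to", the specific edges, and the output file name are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Each row is one edge; the column names "from" and "to" are arbitrary choices.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})

# Build an undirected graph; nodes are created from the values in both columns.
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Draw with the NetworkX defaults, adding labels so each node is identifiable.
fig, ax = plt.subplots()
nx.draw(G, with_labels=True, ax=ax)
fig.savefig("basic_graph.png")  # or plt.show() in an interactive session
```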

Image by Author

Node Color

We can also change the color of all the nodes quite easily with the node_color keyword. You’ll notice a pattern: changing a feature globally for the graph is quite simple (using keywords in the .draw() method).
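A short sketch of a global color change, assuming the same small example graph as before. The color "lightgreen" and the file name are arbitrary choices.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Same illustrative edge list: one row per edge.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# A single color passed to node_color applies to every node.
fig, ax = plt.subplots()
nx.draw(G, with_labels=True, node_color="lightgreen", ax=ax)
fig.savefig("node_color.png")  # or plt.show() in an interactive session
```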

Image by Author

Node Color by Node Type

But let’s say that we want to change the color of nodes specifically by type, instead of globally. This takes a little setup, but once in place we can quickly add new types and automatically color accordingly. Basically, we create another DataFrame where we specify the node ID and node type, then use pd.Categorical() to encode each type as an integer code that a colormap turns into a color.
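One way this setup could look is sketched below. The DataFrame name carac matches the snippet later in this article; the edge list, the string node IDs "1" to "3", and the two colors are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
from matplotlib.colors import ListedColormap

# Illustrative edges linking letter nodes to number nodes (IDs kept as strings).
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["1", "1", "2", "3"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# One row per node, giving its type.
carac = pd.DataFrame({
    "ID": ["A", "B", "C", "1", "2", "3"],
    "type": ["Letter", "Letter", "Letter", "Number", "Number", "Number"],
})

# Reorder the rows to match the order in which NetworkX stores the nodes.
carac = carac.set_index("ID").reindex(list(G.nodes()))

# Categorical codes are integers: "Letter" -> 0, "Number" -> 1 (alphabetical).
carac["type"] = pd.Categorical(carac["type"])
codes = carac["type"].cat.codes.tolist()

# The colormap maps code 0 to blue and code 1 to orange.
cmap = ListedColormap(["tab:blue", "darkorange"])

fig, ax = plt.subplots()
nx.draw(G, with_labels=True, node_color=codes, cmap=cmap, ax=ax)
fig.savefig("node_color_by_type.png")  # or plt.show() interactively
```

Adding a third node type would only require new rows in carac and one more color in the colormap.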

So now our letter nodes are colored blue and our number nodes are colored orange!

Image by Author

Node Size

Altering node size globally is, again, quite simple via a keyword argument in the .draw() method – just specify node_size!
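For example, a sketch with the small letter graph, where the value 2000 is an arbitrary choice:

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Same illustrative edge list: one row per edge.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# A single number passed to node_size applies to every node.
fig, ax = plt.subplots()
nx.draw(G, with_labels=True, node_size=2000, ax=ax)
fig.savefig("node_size.png")  # or plt.show() in an interactive session
```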

Image by Author

Node Size by Node Type

We can alter node size by type just like we can for color! I find this very useful for connecting people to organizations because organizations have many associated people so it makes sense to think of them as hubs with people as the spokes.

So we will build from our node color by type example, but instead of a single keyword argument for node_size we will pass in a list of node sizes referencing the node type used to choose node color.

Here’s the list comprehension logic if anyone is struggling –

For each node in the DataFrame, set the node size to 4000 if that node’s type is not "Letter", otherwise set the node size to 1000. The result is that anything that’s not a letter will be a larger node. With only two node types at the moment this might be overkill, but it will scale better later.

node_sizes = [4000 if entry != 'Letter' else 1000 for entry in carac.type]
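To see that line in context, here is a self-contained sketch that reuses the node-type DataFrame idea. The edge list, string node IDs, and colors are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
from matplotlib.colors import ListedColormap

# Illustrative edges linking letter nodes to number nodes (IDs kept as strings).
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["1", "1", "2", "3"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Node types, reordered to follow the order of G.nodes().
carac = pd.DataFrame({
    "ID": ["A", "B", "C", "1", "2", "3"],
    "type": ["Letter", "Letter", "Letter", "Number", "Number", "Number"],
})
carac = carac.set_index("ID").reindex(list(G.nodes()))
carac["type"] = pd.Categorical(carac["type"])

# One size per node: anything that is not a "Letter" becomes a larger hub.
node_sizes = [4000 if entry != "Letter" else 1000 for entry in carac["type"]]

fig, ax = plt.subplots()
nx.draw(
    G,
    with_labels=True,
    node_color=carac["type"].cat.codes.tolist(),
    cmap=ListedColormap(["tab:blue", "darkorange"]),
    node_size=node_sizes,
    ax=ax,
)
fig.savefig("node_size_by_type.png")  # or plt.show() interactively
```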
Image by Author

Manual Node Size

If we don’t need to change node size by type, but just want to draw attention to specific nodes, we can manually specify a list of sizes. These need to be in the same order as the nodes are stored, so call G.nodes() to generate a list to follow. Some representative sizes are labelled below so you can get a sense of their relative size. I find 5000 to be a good middle ground where a first and last name can fit comfortably.
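A sketch of manual sizing follows. Building the list from a dictionary keyed by node is my own convenience rather than something the approach requires; it simply guarantees the sizes follow the order of G.nodes(). The sizes themselves are arbitrary.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Same illustrative edge list: one row per edge.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# The list of sizes must follow this order.
print(list(G.nodes()))

# Hypothetical sizes per node; "A" is emphasized.
size_by_node = {"A": 5000, "B": 1000, "C": 1000, "D": 2500}
node_sizes = [size_by_node[node] for node in G.nodes()]

fig, ax = plt.subplots()
nx.draw(G, with_labels=True, node_size=node_sizes, ax=ax)
fig.savefig("manual_node_size.png")  # or plt.show() interactively
```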

Image by Author

Edge Width

Now that we’ve covered node attributes, we can move to edges. Setting a global edge width or color is as simple as for nodes: just specify the width or edge_color keyword in the .draw() method.
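For instance, a sketch that thickens and recolors every edge at once. The width of 3 and the color "gray" are arbitrary choices.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Same illustrative edge list: one row per edge.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Single values for width and edge_color apply to every edge.
fig, ax = plt.subplots()
nx.draw(G, with_labels=True, width=3, edge_color="gray", ax=ax)
fig.savefig("edge_width.png")  # or plt.show() in an interactive session
```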

Image by Author

Edge Color

Edges can be colored or sized individually instead of globally by passing in lists of attributes instead of single values. So below we have edge_colors and edge_widths which will be cycled through.
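A sketch with per-edge styling is below. To keep the mapping unambiguous, it supplies exactly one color and one width per edge, in the order returned by G.edges(); the particular colors and widths are arbitrary.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Same illustrative edge list: one row per edge.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Edges are styled in this order.
print(list(G.edges()))

# One entry per edge.
edge_colors = ["red", "green", "blue", "black"]
edge_widths = [1, 2, 3, 4]

fig, ax = plt.subplots()
nx.draw(G, with_labels=True, edge_color=edge_colors, width=edge_widths, ax=ax)
fig.savefig("edge_color.png")  # or plt.show() in an interactive session
```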

Image by Author

Node Border Color

Finally, we can also add a colored border to the nodes with a confusingly named keyword "edgecolors", which is not the same as "edge_color". This can be used to help clarify and separate nodes, which you can see in the example graph below.
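A minimal sketch of node borders follows. Pairing edgecolors with linewidths to control the border thickness, and using white node fills, are my own choices for making the border easy to see.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Same illustrative edge list: one row per edge.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

fig, ax = plt.subplots()
nx.draw(
    G,
    with_labels=True,
    node_color="white",
    node_size=1500,
    edgecolors="black",  # border color of the nodes (not the graph edges)
    linewidths=2,        # border thickness of the nodes
    edge_color="gray",   # color of the lines connecting nodes
    ax=ax,
)
fig.savefig("node_border.png")  # or plt.show() in an interactive session
```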

Image by Author

Graph Layout

One of the most important aspects of a graph is how it’s laid out! This will ultimately determine the readability and usefulness of the graph. NetworkX has many options for determining the layout, and I cover four of the most popular below. The default is spring_layout, which is used in all of the examples above, but others have merit depending on your use case. I recommend trying several to see what works best.

The NetworkX layout documentation lists every available option.
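To compare layouts side by side, something like the sketch below could work. The four layouts chosen here (spring, circular, random, spectral) are an assumption, since the text above does not name them; each layout function returns a dictionary of node positions that is passed to the pos keyword.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Same illustrative edge list: one row per edge.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Layout functions to compare; each maps a graph to {node: (x, y)}.
layouts = {
    "spring": nx.spring_layout,
    "circular": nx.circular_layout,
    "random": nx.random_layout,
    "spectral": nx.spectral_layout,
}

fig, axes = plt.subplots(2, 2, figsize=(8, 8))
positions = {}
for ax, (name, layout) in zip(axes.flat, layouts.items()):
    positions[name] = layout(G)
    nx.draw(G, pos=positions[name], with_labels=True, ax=ax)
    ax.set_title(name)

fig.savefig("layouts.png")  # or plt.show() in an interactive session
```

Note that spring and random layouts change from run to run unless a seed is supplied.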

Image by Author

An Example Network – Tying it Together

So here’s a fully realized example from my project described above. I created a relationship map of prominent professional lighting designers along with some preeminent universities and organizations in the world of theatre design. The goal is to determine how personal connections affect the tight-knit world of theatre designers.

You’ll notice that the text itself can be altered, too, using keywords like font_size and font_weight. Additionally, newline characters "\n" are accepted in node titles and often increase readability. For example, the node for John Gleason is listed as "John\nGleason" in the DataFrame.
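A small sketch of the text options follows. Only the "John\nGleason" label comes from the project; the other node names and all of the connections are hypothetical placeholders, and the font values are arbitrary.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edges; "\n" splits each label across two lines.
df = pd.DataFrame({
    "from": ["John\nGleason", "Designer\nOne", "John\nGleason"],
    "to": ["Example\nUniversity", "Example\nUniversity", "Designer\nOne"],
})
G = nx.from_pandas_edgelist(df, source="from", target="to")

fig, ax = plt.subplots(figsize=(6, 6))
nx.draw(
    G,
    with_labels=True,
    node_size=5000,      # large enough for a first and last name
    node_color="lightblue",
    font_size=10,
    font_weight="bold",
    ax=ax,
)
fig.savefig("labels.png")  # or plt.show() in an interactive session
```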

Image by Author

Conclusion

I hope that this guide gives you working examples of how to customize most aspects of NetworkX graphs to increase readability. NetworkX is an incredibly powerful package, and while its defaults are quite good, you’ll want to draw attention to different information as your projects scale. That can be done in many ways, but changing node size and color, edge width, and graph layout is a great place to start.


Connect

I’m always looking to connect and explore other projects! You can follow me on GitHub or LinkedIn, and check out my other stories on Medium. I also have a Twitter!

