How to Make a Treemap in Python

Use Plotly to make a treemap with a slider to adjust the depth

Kruthi Krishnappa
Towards Data Science


Full Final Treemap — Scroll to Bottom For Interactive Visualization (Image by Author)

Introduction

A treemap is a visualization used to display hierarchical data. The traditional treemap consists of nested rectangles; however, there are also circular treemaps that consist of nested circles. Overall, any shape suitable for nesting can be used. A treemap is best used to show the size of each node in proportion to its parent.

Dataset

The dataset being used is flare.json, which contains the Flare class hierarchy. The data contains the name of each node, its children, and the value of each node, which will be used as the size of the node in the treemap.

Download here: https://github.com/kruthik109/Data-Visualization/blob/main/Advanced-Visualizations/flare.json (Bostock, 2021)
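
For orientation, the structure described above (a name, optional children, and a value) can be sketched with a tiny stand-in. The node names and numbers below are made up for illustration and are not taken from flare.json.

```python
import json

# Hypothetical miniature of a flare-style hierarchy (names and values are made up).
sample_text = """
{
  "name": "root",
  "children": [
    {"name": "a", "children": [
      {"name": "a1", "value": 3},
      {"name": "a2", "value": 5}
    ]},
    {"name": "b", "value": 2}
  ]
}
"""

sample = json.loads(sample_text)

# Internal nodes carry "children"; in this sketch only the leaves carry "value".
top_level_names = [child["name"] for child in sample["children"]]
root_has_value = "value" in sample
print(top_level_names)
print(root_has_value)
```

The root has no value of its own here, which is the situation step two deals with.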

System Requirements

The treemap in this article will be built using the Plotly library in Python due to its ease of use and ability to add interactive features like a slider to control depth and a tooltip. The libraries needed are json, anytree, ipywidgets, plotly, and pandas. I used Google Colab to implement my visualization as it simplifies the library importation process.

import json
!pip install anytree
from anytree import PostOrderIter
from anytree.importer import DictImporter
import ipywidgets as widgets
import plotly.graph_objs as go
import pandas as pd

Step 1: Import Data

The data is formatted as a JSON file. For ease of use, load the file with the json library.

with open('flare.json') as f:
    js_data = json.load(f)

Nested JSON is tricky to work with directly, so the DictImporter from the anytree library is used to simplify the process. It converts the dictionary into a tree of node objects, which gives direct access to each node's children and parent as well as ready-made traversals.

#imports dictionary in a tree form
importer = DictImporter()
root = importer.import_(js_data)

Step 2: Calculate Size for all Nodes

In the flare data, only the leaf nodes have values attached. To get the size of an internal node, sum the values of all the leaf nodes beneath it. The recursive algorithm below uses an altered form of DFS: it descends to the leaf nodes first and then adds each child node's value to its parent node on the way back up.

def format(node):
    for i in node.children:
        #recurse first if the child has no value yet (internal node)
        if not hasattr(i, 'value'):
            format(i)
        #if the parent already has a value, add the child's value to it
        if hasattr(i.parent, 'value'):
            i.parent.value += i.value
        #if the parent doesn't have a value yet, set it to the child's value
        else:
            i.parent.value = i.value
        #insert step 3 code here

format(root)
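
The same bottom-up summation can be sketched on plain dictionaries, independent of anytree. This is an illustrative variant rather than the code used in this article; the helper name total_value and the sample tree are made up.

```python
def total_value(node):
    """Return the node's size, filling in 'value' for internal nodes.

    Assumes every leaf has a 'value' and every internal node has 'children'.
    """
    children = node.get("children", [])
    if children:
        # An internal node's size is the sum of its children's sizes.
        node["value"] = sum(total_value(child) for child in children)
    return node["value"]


# Made-up hierarchy: only the leaves start with a value.
tree = {
    "name": "root",
    "children": [
        {"name": "a", "children": [
            {"name": "a1", "value": 3},
            {"name": "a2", "value": 5},
        ]},
        {"name": "b", "value": 2},
    ],
}

root_total = total_value(tree)
print(root_total)                    # 10
print(tree["children"][0]["value"])  # 8
```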

Step 3: Format Parameters for Treemap

To make a treemap in Plotly, the parameters have to be passed as lists or DataFrame columns. The code below creates four lists: one each for the size, name, and parent of every node, plus one for the level, which is discussed further in step five. The lists are created before format(root) is called, and the append statements are inserted at the end of the loop in the format function.

#create these lists before calling format(root)
size = []
name = []
parent = []
level = []

#the lines below go at the end of the for loop in format
        #append parent to parent list
        parent.append(i.parent.name)
        #append node name to name list
        name.append(i.name)
        #append node size to size list
        size.append(i.value)
        #get the level of each node by taking the length of its ancestors
        #used for step 5
        level.append(len(i.ancestors))

In the format function, the data for the root node never gets appended to the lists because the root is never a child node, so we add those values outside the format function.

#append attributes for root
level.append(0)
name.append(root.name)
parent.append("")
size.append(root.value)
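
To make the target shape concrete, the sketch below builds the same four aligned lists from a small made-up dictionary whose values have already been summed. The flatten helper is hypothetical and visits nodes in a different order than the format function. The order should not matter as long as position k in every list describes the same node.

```python
def flatten(node, parent_name="", depth=0, rows=None):
    """Collect (name, parent, value, level) rows; the root's parent is ""."""
    if rows is None:
        rows = []
    rows.append((node["name"], parent_name, node["value"], depth))
    for child in node.get("children", []):
        flatten(child, node["name"], depth + 1, rows)
    return rows


# Made-up hierarchy with sizes already filled in for every node.
tree = {"name": "root", "value": 10, "children": [
    {"name": "a", "value": 8, "children": [
        {"name": "a1", "value": 3},
        {"name": "a2", "value": 5},
    ]},
    {"name": "b", "value": 2},
]}

rows = flatten(tree)
# Split the rows into four parallel lists.
names, parents, sizes, levels = (list(column) for column in zip(*rows))
print(names)    # ['root', 'a', 'a1', 'a2', 'b']
print(parents)  # ['', 'root', 'a', 'a', 'root']
print(sizes)    # [10, 8, 3, 5, 2]
print(levels)   # [0, 1, 2, 2, 1]
```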

Step 4: Create the Treemap

Using the lists above, create the treemap. The Treemap trace in Plotly needs two parameters to define the hierarchy: labels and parents. The labels list corresponds to the text displayed for each node. The parents list determines which node each node is nested in. The values parameter is optional and assigns a size to each node; here it is the size list. If it is not given, Plotly determines the sizes automatically.

#create plotly figure (plotly.graph_objs was imported as go)
fig = go.Figure()
fig.add_trace(go.Treemap(
    labels = name,
    parents = parent,
    values = size
))
#show figure
fig.show()

Step 5: Add a Slider (Optional)

A treemap can have many levels, and depending on the depth and complexity of the data it can become difficult to interpret. A solution to this problem is to add a slider, which allows the user to control the depth displayed in the visualization for readability. The code created above, with a few additions, can easily be changed to include a slider.

Step 5a: Create a Depth Parameter

First, the level list created in step three, which holds the depth of each node, will be used to control which nodes are shown when the slider is changed. The node.ancestors property that the anytree library provides simplifies this process. If a traditional node class is being used instead, a traversal can be done to count the number of ancestors for each node.
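
For the case without anytree, counting ancestors can be sketched as a walk up the parent links. The SimpleNode class and depth function below are hypothetical stand-ins for whatever node class is in use.

```python
class SimpleNode:
    """Bare-bones tree node that only knows its name and its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def depth(node):
    """Count the ancestors of a node by following parent links to the root."""
    count = 0
    while node.parent is not None:
        node = node.parent
        count += 1
    return count


top = SimpleNode("root")
branch = SimpleNode("a", parent=top)
leaf = SimpleNode("a1", parent=branch)

print(depth(top), depth(branch), depth(leaf))  # 0 1 2
```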

Step 5b: Create DataFrame with Parameters

The second step is to create a data frame with the lists created in step three as columns. This will be needed for the next step which is creating the new treemap.

#create DataFrame
df = pd.DataFrame()
df['parent'] = parent
df['name'] = name
df['value'] = size
df['level'] = level

Step 5c: Function to Create Treemap with Slider Values

Using the DataFrame created in step 5b, make a function to create the treemap and slider. The treemap can be made similarly to the way it was built in step four, except it will use the columns from the data frame instead of the lists. To incorporate the slider into the treemap, use a simple DataFrame filter command. The first filter selects the rows whose level is less than the current slider value. The second selects the column to use.

#create figure and slider
def update(sliderVal):
    fig = go.Figure()
    fig.add_trace(go.Treemap(
        labels = df[df['level']<sliderVal]['name'],
        values = df[df['level']<sliderVal]['value'],
        parents = df[df['level']<sliderVal]['parent']
    ))

Update the root color to be light gray to improve the clarity since the original color is white. Then adjust the node sizes so they are proportional to their parent with branchvalues. Change the layout to fit the desired space. I adjusted the height and width to be 900 by 900.

    fig.update_traces(root_color="#f1f1f1", branchvalues='total')
    #width and height are layout settings, so they are set with update_layout
    fig.update_layout(width=900, height=900)

Show the adjusted plot.

    fig.show()
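
The level filter used in update can be tried in isolation to see which nodes survive at each slider value. The sketch below uses a made-up DataFrame with the same four columns and a hypothetical helper, visible_rows, which applies the filter once instead of repeating it for every column.

```python
import pandas as pd

# Made-up frame with the same columns as df in step 5b.
demo_df = pd.DataFrame({
    "name": ["root", "a", "a1", "a2", "b"],
    "parent": ["", "root", "a", "a", "root"],
    "value": [10, 8, 3, 5, 2],
    "level": [0, 1, 2, 2, 1],
})


def visible_rows(frame, slider_val):
    """Keep only the rows whose level is below the slider value."""
    return frame[frame["level"] < slider_val]


# Names that would be drawn for slider values 0 through 3.
visible_names = {v: list(visible_rows(demo_df, v)["name"]) for v in range(4)}
print(visible_names)
```

Because the comparison is strict, a slider value of 0 keeps no rows and a value of 1 keeps only the root.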

Step 5d: Create a Widget and Connect the Update Function

Connect the widget to the update function and set the range of the slider from the min depth to the max depth.

widgets.interact(update, sliderVal = (0, 5))
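
Rather than hard-coding the slider range, it could be derived from the level list. The sketch below assumes the strict level < sliderVal filter from step 5c; the helper name slider_range is hypothetical.

```python
def slider_range(levels):
    """Return (min, max) slider values for a `level < sliderVal` filter."""
    # A value of 1 shows only the root (level 0);
    # the deepest level plus 1 shows every node.
    return (1, max(levels) + 1)


demo_levels = [0, 1, 2, 2, 1]  # made-up levels for illustration
demo_range = slider_range(demo_levels)
print(demo_range)  # (1, 3)
```

The resulting tuple could then be passed in place of (0, 5), for example as sliderVal = slider_range(level).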

Final Output

Visualization by Author

Hover over the nodes in the visualization above to see their values. Click on a node to zoom in and see a more detailed view of its children. Use the slider to control the depth of nodes to be shown in the visualization. Currently, the slider is at the maximum value so sliding it down will reduce the detail.

Conclusion

Treemaps made with Plotly allow for clear visualization of hierarchical data. The viewer can see the size of the nodes in proportion to their parents, which makes treemaps very useful for comparisons. A treemap is only one type of visualization that can be used for hierarchical data. Stay tuned for my next article where I will discuss more types of hierarchical data visualizations.

The full code can be found here: https://github.com/kruthik109/Data-Visualization/blob/main/Advanced-Visualizations/enclosure_diagram.ipynb

Citations

Bostock, M. (2021, October 27). A JSON file with the Flare Class Hierarchy. https://gist.github.com/mbostock/1044242#file-readme-flare-imports-json. license: gpl-3.0
