Have you ever worked with hierarchical data and struggled to visualise it properly? Then this article is for you! I recently stumbled across the handy circlepackeR package in R that enables you to create interactive charts that facilitate the visualisation of hierarchical data and wanted to share my experiences with you.
Prerequisites
To create a visualisation similar to the one depicted above, you first need to install and load the following packages (circlepackeR is installed from GitHub, which requires the devtools package):
# install.packages("devtools")  # run once if devtools is not installed yet
devtools::install_github("jeromefroe/circlepackeR")
install.packages("data.tree")
install.packages("htmlwidgets")
# load the packages
library(circlepackeR)
library(data.tree)
library(htmlwidgets)
Data
In this short example I will be working with US county data that contains the country, the state name, the county name and the population of each county:
head(counties)
# A tibble: 3,093 x 5
country state county population
<chr> <chr> <chr> <dbl>
United States Alabama Autauga 55380
United States Alabama Baldwin 212830
United States Alabama Barbour 25361
United States Alabama Bibb 22493
United States Alabama Blount 57681
United States Alabama Bullock 10248
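If you want to follow along without the full dataset, a small stand-in can be built by hand from the six rows printed above. This is only a sketch: the real table has 3,093 rows, and how it was loaded is not shown here.

```r
# Hypothetical stand-in for the full counties table,
# built from the six rows printed above.
counties <- data.frame(
  country = "United States",
  state = "Alabama",
  county = c("Autauga", "Baldwin", "Barbour", "Bibb", "Blount", "Bullock"),
  population = c(55380, 212830, 25361, 22493, 57681, 10248),
  stringsAsFactors = FALSE
)

# Check the column names and types
str(counties)
```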
Preprocessing
In order to get the data ready, we need to apply two preprocessing steps:
1. Add a pathString column
counties$pathString <- paste("world", counties$country, counties$state, counties$county, sep = "/")
The new column must be called "pathString", since that is the column name as.Node() looks for by default. It concatenates all the hierarchical levels you want displayed into one long string that starts with a common root (here the word "world") and uses "/" as a separator. This is how the dataframe looks after this step:
country state county population pathString
<chr> <chr> <chr> <dbl> <chr>
United States Alabama Autauga 55380 world/United States/Alabama/Autauga
United States Alabama Baldwin 212830 world/United States/Alabama/Baldwin
United States Alabama Barbour 25361 world/United States/Alabama/Barbour
United States Alabama Bibb 22493 world/United States/Alabama/Bibb
United States Alabama Blount 57681 world/United States/Alabama/Blount
United States Alabama Bullock 10248 world/United States/Alabama/Bullock
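One thing to watch for, assuming "/" is kept as the separator: a name that itself contains "/" would be split into an extra level. A minimal sketch of guarding against that follows. The helper `clean_level()` and the county name "North/South" are made up for illustration.

```r
# Hypothetical helper: replace "/" inside a name so it cannot be
# mistaken for a level separator in pathString.
clean_level <- function(x) gsub("/", "-", x, fixed = TRUE)

# Made-up rows for illustration; "North/South" is not a real county.
demo <- data.frame(
  country = "United States",
  state = "Alabama",
  county = c("Autauga", "North/South"),
  stringsAsFactors = FALSE
)

demo$pathString <- paste(
  "world",
  clean_level(demo$country),
  clean_level(demo$state),
  clean_level(demo$county),
  sep = "/"
)

# Count the levels in each path; every row should have the same number.
lengths(strsplit(demo$pathString, "/", fixed = TRUE))
```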
2. Convert dataframe to a data.tree data structure
The second preprocessing step consists of converting the dataframe into a data.tree data structure. This sounds complicated but can be easily done with the following function:
nodes <- as.Node(counties)
And this is what the newly created structure looks like:
levelName
world
 °--United States
     ¦--Alabama
     ¦   ¦--Autauga
     ¦   ¦--Baldwin
     ¦   ¦--Barbour
     ¦   ¦--Blount
     ¦   ¦--Bullock
..
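Before plotting, it can be worth checking that the tree has the shape you expect. The sketch below rebuilds a three-row stand-in from rows shown earlier and inspects it with fields that data.tree nodes provide. The stand-in data frame is an assumption made so the example runs on its own.

```r
library(data.tree)

# Small stand-in built from rows shown earlier in the article.
counties <- data.frame(
  country = "United States",
  state = "Alabama",
  county = c("Autauga", "Baldwin", "Barbour"),
  population = c(55380, 212830, 25361),
  stringsAsFactors = FALSE
)
counties$pathString <- paste("world", counties$country, counties$state,
                             counties$county, sep = "/")

nodes <- as.Node(counties)

nodes$height      # number of levels: world, country, state, county
nodes$leafCount   # number of leaves, one per county row

# Print the tree together with the attribute used for the circle size
print(nodes, "population", limit = 10)
```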
We are ready to create our visualisation!
Create Circlepacker Graph
Creating the circlepackeR graph couldn’t be easier. Simply run the following code:
# create circlepacker graph
p <- circlepackeR(nodes,
                  size = "population",
                  color_min = rgb(255/255, 0/255, 0/255),
                  color_max = rgb(0/255, 0/255, 0/255),
                  width = 1200,
                  height = 1200)
# save the circle graph
saveWidget(p, file = "circles.html")
Note the arguments above:
- size: Column name that should be used as the size of each circle.
- color_min: Minimum value of the colour range for the circles. Can be either hexadecimal, RGB, or HSL colour.
- color_max: Maximum value of the colour range for the circles. Can be either hexadecimal, RGB, or HSL colour.
- width: Width (in pixels) of the created circles graph (especially useful if your graph is large).
- height: Height (in pixels) of the created circles graph (especially useful if your graph is large).
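To make the colour formats concrete, here is a small sketch that writes the same red three ways. It assumes that the RGB and HSL variants are passed as CSS-style strings, which is how I read the argument descriptions. The tiny demo tree is made up, and the call only runs if both packages are installed.

```r
# Three ways to write the same red, all as plain strings.
red_hex <- "#FF0000"
red_rgb <- "rgb(255,0,0)"      # CSS-style RGB string (assumed to be accepted)
red_hsl <- "hsl(0,100%,50%)"   # CSS-style HSL string (assumed to be accepted)

# R's rgb() helper, as used above, returns the hexadecimal form.
identical(rgb(255/255, 0/255, 0/255), red_hex)

# Only runs if both packages are installed.
if (requireNamespace("circlepackeR", quietly = TRUE) &&
    requireNamespace("data.tree", quietly = TRUE)) {
  # Made-up three-leaf hierarchy, for illustration only.
  demo <- data.frame(
    pathString = c("world/A/a1", "world/A/a2", "world/B/b1"),
    population = c(10, 20, 30),
    stringsAsFactors = FALSE
  )
  tree <- data.tree::as.Node(demo)
  p_hsl <- circlepackeR::circlepackeR(tree,
                                      size = "population",
                                      color_min = red_hsl,
                                      color_max = "hsl(0,0%,0%)")
}
```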
After saving the graph, you will find an HTML file in the specified directory that you can share via email or embed in a webpage. Furthermore, you can embed it in a Medium article using codepen.io: copy the HTML source code and create a CodePen from it.
Adapting the graph
The graph usually turns out nicely but sometimes the font size of the exported graph is a little bit too small. Unfortunately, the font size and any other styling options can’t be set during the creation of the graph, but there is something you can do about it!
Since the exported file is an HTML file, you can simply open it in basically any editor and search for the CSS declarations:
<style type="text/css">.circlepackeR .node {
cursor: pointer;
}
.circlepackeR .node:hover {
stroke: #000;
stroke-width: 1.5px;
}
.circlepackeR .node--leaf {
fill: white;
}
.circlepackeR .label {
font: 11px "Helvetica Neue", Helvetica, Arial, sans-serif;
text-anchor: middle;
text-shadow: 0 1px 0 #fff, 1px 0 0 #fff, -1px 0 0 #fff, 0 -1px 0 #fff;
}
.circlepackeR .label,
.circlepackeR .node--root,
.circlepackeR .node--leaf {
pointer-events: none;
}
</style>
If you have some CSS knowledge, you will find it easy to modify the layout of the graph. If you have never used CSS before, here is how you can change the font size and the colour of the circle outline:
Change font style
Simply change the font size by specifying the number of pixels (e.g. 15px). Furthermore, you can change the font name from "Helvetica Neue" to any desired font:
.circlepackeR .label {
font: 15px "Helvetica Neue", Helvetica, Arial, sans-serif;
text-anchor: middle;
text-shadow: 0 1px 0 #fff, 1px 0 0 #fff, -1px 0 0 #fff, 0 -1px 0 #fff;
}
Change circle outline colour
If you want to change the colour of the circle outline (stroke) that appears when you hover over a circle, you can simply specify the desired HEX code here (e.g. #FF0000):
.circlepackeR .node:hover {
stroke: #FF0000;
stroke-width: 1.5px;
}
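If you regenerate the graph often, editing the file by hand gets tedious. The sketch below shows one way the font-size change could be scripted in R. The helper `set_label_font_size()` is hypothetical, and it assumes the CSS sits in the exported file as plain text, as in the listing above.

```r
# Hypothetical helper: swap the label font size in exported HTML text.
set_label_font_size <- function(html, new_px, old_px = 11) {
  gsub(
    paste0("font: ", old_px, "px"),
    paste0("font: ", new_px, "px"),
    html,
    fixed = TRUE
  )
}

# Tiny stand-in for the contents of the exported file.
html <- c(
  ".circlepackeR .label {",
  "font: 11px \"Helvetica Neue\", Helvetica, Arial, sans-serif;",
  "}"
)
patched <- set_label_font_size(html, 15)
patched[2]

# On a real export (file names are examples only):
# html <- readLines("circles.html", warn = FALSE)
# writeLines(set_label_font_size(html, 15), "circles_large_font.html")
```

Writing to a new file rather than overwriting the export keeps the original around in case the pattern did not match what you expected.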
I hope this was useful to some of you!
Further reading
[1] jeromefroe – D3 Zoomable Circle Packing Visualization (tutorial): http://jeromefroe.github.io/circlepackeR/
[2] circlepackeR Github Repository: https://github.com/jeromefroe/circlepackeR