Advanced Visualization with react-vis

Using Voronois, single pass rendering, and canvas components for amazing user experiences

Andrew McNutt
Towards Data Science


So you’ve started to do some data visualization with react-vis, you’ve built some charts of your own, maybe you’ve read Shyianovska Nataliia’s wonderful introduction Data Visualization with react-vis, or perhaps even built a dashboard. It’s time to learn some techniques for handling larger data sets and more complex UI interactions. In the course of this article we will see how to use single pass rendering, canvas components, debounced state updates, and Voronois. Buckle in, it’s gonna be a wild ride!

To get started you can scaffold a new app using create-react-app, and then run the following in the terminal:

npm install react-vis --save
# we will also be using a few additional libraries
# so install these too
npm install --save d3-fetch d3-scale d3-scale-chromatic debounce simplify-js

That said, I use a slightly different configuration for my apps, which you can check out here (along with the code for this whole article).

Getting and preparing the data

A traditional technique for learning how to do anything is to imitate the masters, and this article will be no exception. We will be exploring the New York Times’s 2016 visualization “Stephen Curry’s 3 Point Record in Context”.

There’s a lot going on in this visualization! Each line shows the number of three pointers made by a particular player over the course of a particular season. This information is made accessible by a dynamic mouseover which simultaneously highlights that particular player-year and provides a tooltip which describes exactly which line the user is hovering over.

Our first step will be to acquire the data. Pleasantly, NYT serves its data for this article as CSVs, so we can pretty easily grab it by watching the network tab in Chrome, as below. Once you’ve downloaded the data file, place it somewhere within your application; I put it under a folder called data and named it “nyt-rip.csv”.

Capturing the CSV data that we will be using for our visualization by watching the network tab in Chrome. Refreshing the page triggers a lot of network calls; we then filter the network history to only CSVs, and save the relevant file by double clicking it.

The format of this CSV is a little inconvenient: it has columns for a player id, the player’s name, the year, and then the running number of three pointers after each game of the season. Let’s clean this up into a format that’s a little easier to work with:

const nonNumericRows = {
  player: true,
  pname: true,
  year: true
};
// reformats input csv so that each row
// has details about the corresponding player
// and an array of points describing that player's record
export function cleanData(data) {
  return data.map(row => {
    // extract all of the columns for
    // this row that update that player's score
    const gameData = Object.keys(row)
      .filter(key => !nonNumericRows[key] && row[key] !== 'NA')
      .map((key, x) => ({x, y: Number(row[key])}));
    // return a formatted object to manipulate
    return {
      player: row.player,
      pname: row.pname,
      year: row.year,
      height: gameData[gameData.length - 1],
      gameData
    };
  });
}
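To make the reshaping concrete, here is a small sketch that runs the same logic on a single made-up row. The game column names (g1 to g4) and the values are assumptions for illustration only; the real CSV’s game columns may well be named differently.

```javascript
// Same logic as cleanData above, inlined so the snippet runs on its own.
const nonNumericRows = {player: true, pname: true, year: true};

function cleanData(data) {
  return data.map(row => {
    const gameData = Object.keys(row)
      .filter(key => !nonNumericRows[key] && row[key] !== 'NA')
      .map((key, x) => ({x, y: Number(row[key])}));
    return {
      player: row.player,
      pname: row.pname,
      year: row.year,
      height: gameData[gameData.length - 1],
      gameData
    };
  });
}

// Hypothetical row: without a row converter, d3-fetch's csv() yields strings.
const sampleRows = [
  {player: 'p1', pname: 'Example Player', year: '2016', g1: '2', g2: '5', g3: 'NA', g4: 'NA'}
];

const cleaned = cleanData(sampleRows);
console.log(cleaned[0].gameData);
// [ { x: 0, y: 2 }, { x: 1, y: 5 } ]
```

Note that the 'NA' columns are dropped, so a player who played fewer games simply gets a shorter line.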

Excellent, now we have a mechanism to format our data in a usable way. While we’re writing utility functions, let’s also write one to grab the data domain:

export function getDataDomain(data) {
  const {min, max} = data.reduce((acc, player) => {
    return player.gameData.reduce((mem, row) => {
      return {
        min: Math.min(mem.min, row.y),
        max: Math.max(mem.max, row.y)
      };
    }, acc);
  }, {min: Infinity, max: -Infinity});
  return [min, max];
}
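As a quick sanity check, here is a sketch of the domain helper applied to two tiny made-up players (the numbers are invented for illustration). The nested reduce threads a single {min, max} accumulator through every point of every player.

```javascript
// Same logic as getDataDomain above, inlined so the snippet runs on its own.
function getDataDomain(data) {
  const {min, max} = data.reduce((acc, player) => {
    return player.gameData.reduce((mem, row) => ({
      min: Math.min(mem.min, row.y),
      max: Math.max(mem.max, row.y)
    }), acc);
  }, {min: Infinity, max: -Infinity});
  return [min, max];
}

// Hypothetical cleaned data with only the field the helper reads.
const players = [
  {gameData: [{x: 0, y: 1}, {x: 1, y: 4}]},
  {gameData: [{x: 0, y: 0}, {x: 1, y: 9}]}
];

const domain = getDataDomain(players);
console.log(domain); // [ 0, 9 ]
```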

Put both of these functions into a utils.js file, and we’re off to the races.

A first naive approach

To set the stage for our subsequent optimizations let’s throw together a naive approach. There are 750 player-years rows in our data set, so let’s just have 750 LineSeries. When we hover over them, let’s redraw the lines and highlight the selected one. Pretty reasonable! Here is a full pass at the naive implementation, after which I’ll go through and describe each part:

Oof, that’s a lot all at once! But it’s pretty easy to deal with if we break it apart into chunks. The first thing our component does when it gets ready to mount is to call out for data via the d3-fetch csv function; we then clean the data and save it into the state for ingestion in the near future.

componentWillMount() {
  csv('data/nyt-rip.csv')
    .then(data => this.setState({
      data: cleanData(data),
      loading: false
    }));
}

We start our render function by refusing to try to render our component if the data hasn’t been loaded yet, instead just giving back a loading message (though you could easily show a spinner or use something fancy like placeloader). Next we build up a color scale powered by d3-scale-chromatic using our getDataDomain function from the previous section. In this article we are primarily interested in reconstructing the feel (as opposed to the exact look) of the original NYT visualization, so here we use a different color scale, and forgo some of their extra chart decorations.

const {loading, highlightSeries, data} = this.state;
if (loading) {
  return <div><h1>LOADING</h1></div>;
}
const dataDomain = getDataDomain(data);
const domainScale = scaleLinear().domain(dataDomain).range([1, 0]);
const colorScale = val => interpolateWarm(domainScale(val));
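The range [1, 0] is what flips the scale: the lowest totals map to 1 and the highest to 0, and that normalized value is what gets handed to the color interpolator. Here is a minimal sketch of that mapping, hand-rolling the linear interpolation so it runs without d3. The linearScale helper is a hypothetical stand-in for scaleLinear, and the domain [0, 402] is made up.

```javascript
// Hypothetical stand-in for scaleLinear().domain([d0, d1]).range([r0, r1]).
function linearScale([d0, d1], [r0, r1]) {
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Made-up domain; in the article it comes from getDataDomain(data).
const domainScale = linearScale([0, 402], [1, 0]);

console.log(domainScale(0));   // 1
console.log(domainScale(201)); // 0.5
console.log(domainScale(402)); // 0
```

Swapping the range to [0, 1] would reverse which end of the color ramp the record-setting seasons land on.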

Finally, we come to the actual rendering of our chart. We start by looping over all of our data rows and creating a LineSeries for each of them, along with defining a rudimentary interaction technique. Next we add a LabelSeries to highlight only certain points along the y-axis, and an XAxis with special formatting to match the labels given by the NYT graphic.

<XYPlot {...layout}>
  {data.map((player, idx) => {
    const playerHighlightString = `${player.pname}-${player.year}`;
    return (
      <LineSeries
        key={idx}
        strokeWidth="4"
        curve="curveStepBefore"
        data={player.gameData}
        onSeriesMouseOver={() =>
          this.setState({highlightSeries: playerHighlightString})}
        onSeriesMouseOut={() =>
          this.setState({highlightSeries: null})}
        stroke={
          playerHighlightString === highlightSeries ? 'black' :
            colorScale(player.gameData[player.gameData.length - 1].y)
        }
      />);
  })}
  <LabelSeries
    data={data.filter(row => labelMap[`${row.pname}-${row.year}`])}
    style={{fontSize: '10px', fontFamily: 'sans-serif'}}
    getY={d => d.gameData[d.gameData.length - 1].y}
    getX={d => 82}
    labelAnchorX="start"
    getLabel={d =>
      `${d.pname} - ${d.gameData[d.gameData.length - 1].y}`}/>
  <XAxis
    style={{ticks: {fontSize: '10px', fontFamily: 'sans-serif'}}}
    tickFormat={d =>
      !d ? '1st game' : (!(d % 10) ? `${d}th` : '')}/>
</XYPlot>
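The nested ternary in tickFormat is a little dense, so here is the same expression pulled out and run against a few tick values (the list of values is arbitrary). Zero becomes the special first label, multiples of ten get a “th” suffix, and everything else is left blank.

```javascript
// The tickFormat expression from the XAxis above, as a standalone function.
const tickFormat = d => (!d ? '1st game' : (!(d % 10) ? `${d}th` : ''));

const labels = [0, 5, 10, 20, 82].map(tickFormat);
console.log(labels);
// [ '1st game', '', '10th', '20th', '' ]
```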

The astute reader will observe that we’ve used a couple of constants, imported from a separate constants.js file, which are pulled out for logical convenience:

export const desiredLabels = [
  {pname: 'Brian Taylor', year: 1980},
  {pname: 'Mike Bratz', year: 1981},
  {pname: 'Don Buse', year: 1982},
  {pname: 'Mike Dunleavy', year: 1983},
  {pname: 'Larry Bird', year: 1986},
  {pname: 'Danny Ainge', year: 1988},
  {pname: 'Michael Adams', year: 1989},
  {pname: 'Michael Adams', year: 1990},
  {pname: 'Vernon Maxwell', year: 1991},
  {pname: 'Vernon Maxwell', year: 1992},
  {pname: 'Dan Majerle', year: 1994},
  {pname: 'John Starks', year: 1995},
  {pname: 'Dennis Scott', year: 1996},
  {pname: 'Reggie Miller', year: 1997},
  {pname: 'Dee Brown', year: 1999},
  {pname: 'Gary Payton', year: 2000},
  {pname: 'Antoine Walker', year: 2001},
  {pname: 'Jason Richardson', year: 2008},
  {pname: 'Stephen Curry', year: 2013},
  {pname: 'Stephen Curry', year: 2014},
  {pname: 'Stephen Curry', year: 2015},
  {pname: 'Stephen Curry', year: 2016}
];
export const layout = {
  height: 1000,
  width: 800,
  margin: {left: 20, right: 200, bottom: 100, top: 20}
};
export const NUMBER_OF_GAMES = 82;
export const MAX_NUMBER_OF_THREE_POINTERS = 405;
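The chart code filters rows with a labelMap that isn’t shown in this article. One plausible way to build it, assuming it is simply a lookup keyed by the same `pname-year` string used elsewhere, is to reduce over desiredLabels. Treat this as a guess at the shape rather than the actual implementation; the rows below are made up.

```javascript
// Abbreviated copy of desiredLabels, for illustration only.
const desiredLabels = [
  {pname: 'Larry Bird', year: 1986},
  {pname: 'Stephen Curry', year: 2016}
];

// Assumed shape: {'Larry Bird-1986': true, 'Stephen Curry-2016': true}
const labelMap = desiredLabels.reduce((acc, {pname, year}) => {
  acc[`${pname}-${year}`] = true;
  return acc;
}, {});

// Hypothetical cleaned rows; the year arrives from the CSV as a string.
const rows = [
  {pname: 'Stephen Curry', year: '2016'},
  {pname: 'Example Player', year: '2016'}
];

const labeled = rows.filter(row => labelMap[`${row.pname}-${row.year}`]);
console.log(labeled.map(row => row.pname)); // [ 'Stephen Curry' ]
```

Because the key is built with a template string, it doesn’t matter that the constants store the year as a number while the CSV stores it as a string.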

Putting it all together, this looks like:

Pretty cool! However, the responsiveness is not very good: lines don’t highlight when we get close to them, and the browser noticeably lags. This strategy also prevents us from adding the tooltip along the side (we could add one, but it would be subject to a lot of jitter as we hover across various elements). There’s gotta be a better way!

A better architecture

So far we’ve been rendering our series using SVG lines. While this approach makes it easy to reason about the state of the UI, it is REALLY inefficient to redraw all of our lines every time. This is due to the fact that each of these lines is modeled as a very detailed DOM node, which weighs quite heavily on the browser. To alleviate this weight we can use the built-in react-vis canvas series, LineSeriesCanvas. Canvas tends to be quite a lot faster to render than SVG, but doesn’t have the same detailed representation in the DOM, which means that any interactions have to be hand brewed. Evidently, dropping this new series into our naive solution will make the whole page faster, but we will lose our dynamic interactivity.

To address this we will separate our chart into two components, one that handles interactivity, and one that handles everything else. This is motivated by the idea that React only executes the render function for components that have been updated.

The organization of our chart component after breaking out the interactive parts into a separate component

Through this architecture we will have one component that renders the canvas lines, and one that renders the highlight line and highlight tooltip, effectively separating the elements that are expensive or time consuming to render from those that are quick to render. Some pseudo-code for the layout:

<div className="relative">
  <NonInteractiveParts />
  <InteractiveParts />
</div>

We want these components to appear to be a single beautiful chart, so we give the interactive parts the CSS properties

position: absolute;
top: 0;

This allows the interactive parts to “sit on top of” the non-interactive parts, thus completing the look.

The static parts

We’re really getting somewhere now. Observe that the static parts of the chart are pretty similar to what we had in the naive approach: just a container with some series in it. In the interest of brevity we can combine the root component and canvas part illustrated above into a single component, as each of them is only rendered a single time.

This new component is quite similar to our first one. Our mounting step does a little extra work to facilitate the rendering of the interactivity component (more on that in a moment) and makes a call to our soon-to-be-written InteractiveComponents. But other than that, not much has changed!

The interactive part

Here comes the cool stuff. We saw above that we prepared our data a little bit extra in the componentWillMount step to get it ready for the interactive component. Let’s take a more detailed look:

componentWillMount() {
  csv('data/nyt-rip.csv')
    .then(data => {
      const updatedData = cleanData(data);
      const playerYearMap = updatedData.reduce((acc, row) => {
        const {pname, year, gameData} = row;
        acc[`${pname}-${year}`] = gameData[gameData.length - 1].y;
        return acc;
      }, {});
      const playerMap = updatedData.reduce((acc, row) => {
        acc[`${row.pname}-${row.year}`] = row;
        return acc;
      }, {});
      this.setState({
        data: updatedData,
        loading: false,
        allPoints: buildVoronoiPoints(updatedData),
        playerYearMap,
        playerMap
      });
    });
}

Our cleaned data is just the same as before, squeaky and easy to use. Next we introduce a new variable called playerYearMap: this is an object with keys equal to the player-year unique identifiers and values equal to the maximum number of three pointers each player reached. This will be used to simplify the positioning of the labels and the tooltip. Similarly we introduce a playerMap, which also has player-year identifiers as keys, but this time the whole row as values. This will allow for fast, constant-time lookup of rows as we mouse over things.
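To see what the two lookups hold, here is a sketch that builds them from two made-up rows in the shape cleanData produces (the player names and numbers are invented). The reduce calls are the same as in the mounting step.

```javascript
// Hypothetical cleaned rows, trimmed to the fields the lookups read.
const updatedData = [
  {pname: 'Player A', year: '2015', gameData: [{x: 0, y: 1}, {x: 1, y: 3}]},
  {pname: 'Player B', year: '2016', gameData: [{x: 0, y: 2}, {x: 1, y: 7}]}
];

// player-year key -> final (maximum) three pointer count
const playerYearMap = updatedData.reduce((acc, row) => {
  const {pname, year, gameData} = row;
  acc[`${pname}-${year}`] = gameData[gameData.length - 1].y;
  return acc;
}, {});

// player-year key -> the whole row
const playerMap = updatedData.reduce((acc, row) => {
  acc[`${row.pname}-${row.year}`] = row;
  return acc;
}, {});

console.log(playerYearMap['Player B-2016']); // 7
console.log(playerMap['Player A-2015'] === updatedData[0]); // true
```

Looking a row up by key like this avoids scanning all 750 rows on every mouse move.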

The last new variable is called allPoints and is generated by a function called buildVoronoiPoints. What could that be for, I wonder? Well, here is the function (it lives in utils):

export function buildVoronoiPoints(data) {
  return data.reduce((acc, {player, pname, year, gameData}) => {
    return acc.concat({
      player,
      pname,
      year,
      x: 41,
      y: gameData[gameData.length - 1].y
    });
  }, []);
}

This creates a single point for each player-year in the center of the domain, at the “height” of that player’s maximum number of three pointers. We can use this to create a Voronoi. A Voronoi is a partitioning of the plane, built from a collection of points, such that each point is contained within its own cell and every location in a cell is closer to that cell’s point than to any other. We are guaranteed that each point will be isolated from every other point. This property can be used to great effect for mouseovers, since the user only mouses over a single point at a time.

A colored in voronoi diagram, source wikipedia
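Since a Voronoi cell is just “everything closest to this point”, hovering a cell is equivalent to a nearest-point lookup. The sketch below uses a brute-force lookup to show what happens when every point shares x = 41, as the output of buildVoronoiPoints does: the selection depends only on the vertical position. The nearestPoint helper and the three points are hypothetical, and this works in data coordinates for simplicity rather than the pixel coordinates the real component uses.

```javascript
// Three made-up points, all at x = 41 like the output of buildVoronoiPoints.
const points = [
  {pname: 'Low', x: 41, y: 50},
  {pname: 'Mid', x: 41, y: 150},
  {pname: 'High', x: 41, y: 400}
];

// Brute-force nearest neighbour: the owner of the Voronoi cell containing (x, y).
function nearestPoint(nodes, x, y) {
  let best = null;
  let bestDist = Infinity;
  nodes.forEach(node => {
    const dist = (node.x - x) ** 2 + (node.y - y) ** 2;
    if (dist < bestDist) {
      bestDist = dist;
      best = node;
    }
  });
  return best;
}

// Moving left and right at a fixed height never changes the selection...
const horizontal = [0, 41, 82].map(x => nearestPoint(points, x, 120).pname);
console.log(horizontal); // [ 'Mid', 'Mid', 'Mid' ]

// ...while moving up and down does.
const vertical = [60, 120, 300].map(y => nearestPoint(points, 41, y).pname);
console.log(vertical); // [ 'Low', 'Mid', 'High' ]
```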

We are imitating the mouseover setup found in the original NYT graphic, where the plane is sliced into horizontal bands such that as you mouse up and down the currently selected player-year changes, and as you mouse left and right it stays the same. We can reconstruct this behavior by using a Voronoi with our specially built allPoints from before. Once implemented, the layout of the Voronoi cells will look like this:

(When we’re done we will remove the stroke attribute from the Voronoi cells shown here, that way our user is none the wiser about our mouse over technique 😉)

Pretty cool! Now we’re ready to see the code for the interactive component.

Just as in our naive approach let’s go through the interesting parts one at a time. Let’s start with the elephant in the room:

<Voronoi
  extent={[
    [0, y(MAX_NUMBER_OF_THREE_POINTERS)],
    [width, height - margin.bottom]
  ]}
  nodes={allPoints}
  polygonStyle={{
    // REMOVE THE STROKE BELOW TO HIDE THE VORONOI CELLS
    stroke: 'rgba(0, 0, 0, .2)'
  }}
  onHover={row => {
    const player = playerMap[`${row.pname}-${row.year}`];
    if (!player) {
      this.setState({
        highlightSeries: null,
        highlightTip: null
      });
      return;
    }
    this.debouncedSetState({
      highlightSeries: player.gameData,
      highlightTip: {
        y: player.gameData[player.gameData.length - 1].y,
        name: row.pname
      }
    });
  }}
  x={d => x(d.x)}
  y={d => y(d.y)}
/>

This innocuous component takes in a list of points and builds a Voronoi straight into the DOM. Its prop signature is a little bit anomalous compared to other components in react-vis (you need to provide it with scales and an extent), so be careful! The other thing worth noting in this component is the use of the debouncedSetState function, which, as you’ll note from above, we have to define via:

componentWillMount() {
  this.debouncedSetState = debounce(newState =>
    this.setState(newState), 40);
}

This function makes use of debounce (from the debounce library we installed at the start; lodash offers an equivalent), which holds off on calling a function until calls to it have stopped arriving for a given interval (in this case 40 ms). Preventing super fast state changes is advantageous to us, as we don’t want every little motion the user makes to induce a state change. This would cause jitter and unnecessary redraws! To mask this little delay we add animation by including the animation prop on the LineSeries.
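If debouncing is new to you, the idea fits in a few lines. The sketch below hand-rolls a simpleDebounce helper (a hypothetical stand-in for illustration, not the library’s implementation) and fires three rapid “hover” updates at it; the array stands in for component state.

```javascript
// Minimal debounce: each call cancels the pending one and restarts the timer,
// so fn only runs once calls have stopped arriving for `wait` milliseconds.
function simpleDebounce(fn, wait) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), wait);
  };
}

const applied = [];
const debouncedSetState = simpleDebounce(newState => applied.push(newState), 40);

// Three rapid "hover" events in the same tick...
debouncedSetState({highlightTip: 'a'});
debouncedSetState({highlightTip: 'b'});
debouncedSetState({highlightTip: 'c'});

// ...nothing has been applied yet; only the last state lands after the pause.
console.log(applied.length); // 0
setTimeout(() => console.log(applied), 100); // [ { highlightTip: 'c' } ]
```

Only the final hover target ever reaches the state, which is exactly the jitter-free behavior we want while the mouse is still moving.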

Putting this all together we get:

Our penultimate draft of the visualization

Modulo some small styling touches, that’s it! It operates smoothly and accurately reproduces the interaction found in the original visualization. Pretty dang cool!

One more thing

We’ve done pretty well at emulating the functionality of the original NYT visualization, but now we want to know: can we do better? The mouse over functionality that they present is reasonable, however it’s a little bit clunky and prevents you from mousing over every single player because of overlaps. We would like it if we could streamline this in a sensible and performant way. The answer to this problem, and, really if we’re being honest, most problems, is to make our Voronoi more nuanced.

A naive knee-jerk reaction would be to create Voronoi cells for every data point. While it is totally possible to do this, it would create a lot of cells. If there are 82 games in the season for 750 players, that would be 61,500 distinct cells! A general rule of thumb says that the browser can only handle at most a couple thousand SVG elements, so we need to be more clever.

A powerful solution is to develop a simplified model of our dataset by simplifying each of our lines. Pleasantly, there is a healthy body of work on the subject of line simplification, such as the ever excellent Vladimir Agafonkin/mourner’s simplify-js library. Much of this type of work has arisen because of cartographers’ interest in making shape-preserving simplifications to the ragged edges of coastlines and other irregular geographic bodies. But we’ll put it to a different use.

mourner’s simplify-js demo page, simplifying complex lines.

We will simplify each player-year’s line, so that we get the gist of their line without having too much detail. We execute on this idea by putting simplify-js to work on our data with another addition to our utils file:

import simplify from 'simplify-js';

// stephCurryRecord is not shown in this article; it is assumed to be the
// array of {x, y} points for Curry's 2016 line
const betterThanCurryLine = simplify(stephCurryRecord, 0.5)
  .map(({x, y}) =>
    ({x, y: y + 20, pname: 'Stephen Curry', year: 2016}));
const SIMPLICATION = 3;
export function buildVoronoiPointsWithSimplification(data) {
  return data.reduce((acc, {player, pname, year, gameData}) => {
    return acc.concat(
      simplify(gameData, SIMPLICATION).map(({x, y}) => ({player, pname, year, x, y}))
    );
  }, betterThanCurryLine);
}
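To get a feel for what the simplification does to a line, here is a self-contained sketch of the classic Douglas-Peucker algorithm, one of the techniques simplify-js builds on. This is an illustrative reimplementation under my own assumptions, not the library’s code, and the sample season is made up: a slow start followed by a hot streak.

```javascript
// Perpendicular distance from point p to the line through a and b.
function distanceToLine(p, a, b) {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const length = Math.hypot(dx, dy);
  if (length === 0) {
    return Math.hypot(p.x - a.x, p.y - a.y);
  }
  return Math.abs(dy * (p.x - a.x) - dx * (p.y - a.y)) / length;
}

// Keep the end points; recursively keep the farthest interior point
// whenever it is more than `tolerance` away from the current chord.
function douglasPeucker(points, tolerance) {
  if (points.length <= 2) {
    return points;
  }
  const first = points[0];
  const last = points[points.length - 1];
  let maxDist = 0;
  let index = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const dist = distanceToLine(points[i], first, last);
    if (dist > maxDist) {
      maxDist = dist;
      index = i;
    }
  }
  if (maxDist <= tolerance) {
    return [first, last];
  }
  const left = douglasPeucker(points.slice(0, index + 1), tolerance);
  const right = douglasPeucker(points.slice(index), tolerance);
  // Drop the shared split point from the left half to avoid a duplicate.
  return left.slice(0, -1).concat(right);
}

// Made-up cumulative totals: one three per game, then six per game.
const season = [0, 1, 2, 3, 4, 10, 16, 22, 28].map((y, x) => ({x, y}));
const simplified = douglasPeucker(season, 1);
console.log(simplified);
// [ { x: 0, y: 0 }, { x: 4, y: 4 }, { x: 8, y: 28 } ]
```

Nine points collapse to three, yet the bend in the line survives, which is what keeps the resulting Voronoi cells hugging each player’s trajectory.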

Being visualizationists, it is our first impulse to try to see what these simplifications look like, and voila:

Visualization of the simplification of our model in preparation for a more robust voronoi strategy (left) compared to the unsimplified data (right). On the left we see that there is a (comparatively) jagged line above all of the others; this line is used to specially tune the character of the resulting voronoi. Curry’s record breaking trajectory plays a big role in the user’s experience of this chart, so it is necessary to provide some extra polish to make sure interactions with that line really shine.

With all of these pieces in place, we are, at last, ready to see the final voronoi in action. Behold:

The user experience of using our line-simplified voronoi technique (left), and a visualization of what’s going on under the hood (right).

Conclusion

Over the course of this article we saw how to reconstruct a publication quality interactive data graphic using react-vis. This involved using a wide variety of interactive visualization techniques including canvas rendering, single pass rendering, debouncing, Voronois, and line simplification as a mechanism for tracing out line series. Not every visualization needs this many optimizations and advanced techniques; however, we know a lot about our data and target behavior, so we were able to tune it to be just the way we wanted.

In building our visualization we made heavy use of a number of features of react-vis, however the ideas discussed in this article are applicable to any web based visualization framework. You could easily implement these techniques in vanilla d3, semiotic, or any of the huge host of other visualization tools available.

If you want to see the code in context, check out this repo. For additional advanced examples check out the charts section of the react-vis documentation or some of my personal work, like this, or this, or maybe even this.

Happy visualizing!
