
Small Steps Towards More Intuitive and Aesthetic Graphics

How minor figure modifications and visualization decisions can result in clearer, more informative plots

Image by author

The field of data visualization began with William Playfair's development of fundamental graphical designs in the late eighteenth century and has advanced to the point where trillions of images of statistical graphics are published every year. Good data visualization skills matter more and more as datasets become larger and more complex in the current information age.

Regardless of the intended use, the most effectively designed data graphics are the simplest figures that convey the intended information. It is therefore important to think of data visualization as both a science and an art: effective quantitative figures must convey the data simply yet accurately while also being aesthetically pleasing.

There are plenty of tutorials on how to make general improvements to figures, such as adjusting font size, marker size, labels, legends, and colors for clarity. Many of the programming languages and visualization tools available today are excellent for generating quantitative graphics, but they can be put to better use by making simple changes that depend on the data being visualized. Few people consider how the type of data being plotted can guide visualization decisions that make graphics clearer and more informative for a reader.


Making graphs more intuitive

There are several ways to improve the appearance of graphics. Visualization decisions such as choosing the color of a line might not seem to matter much, but drawing higher temperatures as a red line and lower temperatures as a blue line in a time-temperature plot will likely make the figure easier for readers to interpret.

The following four examples compare poor and good visualizations of four datasets with increasing dimensionality. The table below summarizes the type of figures and variables used as examples.

Note that datasets with higher dimensionality are more complex and generally consist of more plotting parameters for visualizing the additional variables.

One-Dimensional Graphics

The gross domestic product data from each continent was used to generate a bar plot for the 1D case. The figure below compares a poor graphic on the left to a good graphic on the right.

Gross domestic product data per continent plotted in the poor graphic at the left and styled graphic at the right. Image by author

The specific modifications made to improve the appearance of the quantitative figure included sorting the data in ascending order, color coding each continent, and adding data labels at the end of each bar.
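The sorting and labeling steps described above can be sketched in a few lines of plain Python. This is only an illustration, not the author's code: the continent names and values are hypothetical placeholders rather than real GDP figures, and `prepare_bar_data` is a helper name invented for this example.

```python
# Hypothetical placeholder values in arbitrary units, NOT real GDP figures.
gdp_by_continent = {
    "Continent A": 12.5,
    "Continent B": 3.1,
    "Continent C": 27.8,
    "Continent D": 0.9,
}


def prepare_bar_data(values):
    """Sort categories in ascending order and build one text label per bar."""
    ordered = sorted(values.items(), key=lambda item: item[1])
    names = [name for name, _ in ordered]
    heights = [value for _, value in ordered]
    # One decimal place with thousands separators keeps the labels short.
    labels = [f"{value:,.1f}" for value in heights]
    return names, heights, labels


names, heights, labels = prepare_bar_data(gdp_by_continent)
for name, label in zip(names, labels):
    print(f"{name:<12} {label:>6}")
```

The sorted names and heights could then be passed to a bar-plot call in whichever plotting library is in use (for example matplotlib's `Axes.barh`), with each label drawn at the end of its bar.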

Two-Dimensional Graphics

The 2D figure is a synthetic time-temperature line plot showing monthly temperature throughout the year for a city. The figure below compares a poor graphic on the left to a good graphic on the right.

Synthetic temperature data for a city in North America plotted in the poor graphic at the left and styled graphic at the right. Image by author

The modifications made to improve the appearance of the quantitative figure included changing the line colors to be more intuitive (red for the highs and blue for the lows), shifting the axes and legend so that there is less white space, replacing the month numbers with month names, and adding data labels for each point.
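As a rough sketch of those adjustments, the snippet below prepares month-name tick labels, an intuitive color mapping, and tight axis limits using only the Python standard library. The temperature values are hypothetical, and the helper names `month_tick_labels` and `tight_limits` and the 5% padding are assumptions made for this example.

```python
import calendar


def month_tick_labels(month_numbers):
    """Map month numbers (1-12) to abbreviated month names."""
    return [calendar.month_abbr[m] for m in month_numbers]


def tight_limits(values, pad_fraction=0.05):
    """Return (low, high) axis limits that hug the data with a small margin."""
    low, high = min(values), max(values)
    pad = (high - low) * pad_fraction
    return low - pad, high + pad


# Hypothetical monthly highs and lows in degrees Celsius, for illustration only.
months = list(range(1, 13))
highs = [1, 3, 8, 15, 21, 26, 29, 28, 23, 16, 9, 3]
lows = [-7, -6, -2, 4, 10, 15, 18, 17, 12, 6, 1, -4]

# Intuitive colors: a warm color for the highs, a cool color for the lows.
series_colors = {"High": "red", "Low": "blue"}

tick_labels = month_tick_labels(months)
y_limits = tight_limits(highs + lows)

print(tick_labels)
print(y_limits)
```

Computing the limits from the data, rather than accepting the defaults, is one simple way to reduce the white space around the lines.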

Three-Dimensional Graphics

The 3D figure is a synthetic scatter plot showing the height, weight, and body fat percentage for gym members. The figure below compares a poor graphic on the left to a good graphic on the right.

Synthetic height, weight, and body fat percentage data plotted in the poor graphic at the left and styled graphic at the right. Image by author

The modifications made to improve the appearance of the quantitative figure included replacing the lines with a grayscale background that better highlights the representative areas, shifting the axes so that there is less white space, changing the units of weight and height from kg and cm to lb and ft, using a colorbar with an increased color range, and setting limits on the colorbar values.
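The unit conversion and color-limit steps can be illustrated with a short standard-library sketch. The member records and the 10 to 35 percent color limits are hypothetical values chosen for this example, and the function names are invented here.

```python
KG_TO_LB = 2.20462    # pounds per kilogram (approximate)
CM_TO_FT = 1 / 30.48  # feet per centimetre (1 ft is exactly 30.48 cm)


def kg_to_lb(kg):
    """Convert a weight from kilograms to pounds."""
    return kg * KG_TO_LB


def cm_to_ft(cm):
    """Convert a height from centimetres to decimal feet."""
    return cm * CM_TO_FT


def clip(value, vmin, vmax):
    """Clamp a value to [vmin, vmax], mimicking fixed colorbar limits."""
    return max(vmin, min(value, vmax))


# Hypothetical gym members: (height in cm, weight in kg, body fat in %).
members = [(182.88, 100.0, 42.0), (165.0, 58.0, 5.0), (175.0, 72.0, 20.0)]

# Hypothetical color limits, so that outliers do not stretch the color scale.
FAT_MIN, FAT_MAX = 10.0, 35.0

for height_cm, weight_kg, fat in members:
    print(
        f"{cm_to_ft(height_cm):.2f} ft, "
        f"{kg_to_lb(weight_kg):.1f} lb, "
        f"color value {clip(fat, FAT_MIN, FAT_MAX):.1f}"
    )
```

In matplotlib, for instance, the same limits would typically be supplied through the `vmin` and `vmax` arguments of the scatter call rather than by clipping the data by hand.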

Four-Dimensional Graphics

The 4D figure is a scatter plot showing the price, weight, color, and clarity of approximately 54,000 diamonds. The figure below compares a poor graphic on the left to a good graphic on the right.

Diamond weight, price, color, and clarity data plotted in the poor graphic at the left and styled graphic at the right. Image by author

The modifications made to improve the appearance of the quantitative figure included scaling the data to better distinguish diamonds with different properties, shifting the axes so that there is less white space, changing the axes to a log scale because the data is closer to lognormally distributed, adding evenly spaced ticks to the axes, sorting the data by clarity so that bigger dots plot beneath smaller dots, outlining each point with a black edge to increase its contrast with neighboring points, adding annotations that help the reader identify which diamond color and clarity values are better, and using a colorbar with an increased color range.
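Two of these steps, choosing evenly spaced ticks for a log axis and ordering the points so that large markers are drawn first, can be sketched as follows. The sketch assumes that marker size encodes clarity, as the description implies. The price range and marker sizes are hypothetical, and `decade_ticks` and `draw_order` are names made up for this example.

```python
import math


def decade_ticks(vmin, vmax):
    """Powers of ten bracketing [vmin, vmax]; evenly spaced on a log axis."""
    first = math.floor(math.log10(vmin))
    last = math.ceil(math.log10(vmax))
    return [10.0 ** k for k in range(first, last + 1)]


def draw_order(sizes):
    """Indices from largest to smallest marker, so big dots are drawn first
    and end up beneath the smaller dots plotted after them."""
    return sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True)


# Hypothetical price range (in dollars) and marker sizes, for illustration only.
price_ticks = decade_ticks(300, 19000)
marker_sizes = [20, 80, 50]
order = draw_order(marker_sizes)

print(price_ticks)
print(order)
```

Reordering every plotted column with the same index list keeps each diamond's price, weight, color, and size aligned while changing only the drawing order.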


Summary

The four examples presented here compared graphics generated using default settings and graphics styled to improve the appearance and create clearer, more informative plots. While specific improvements depend on the type of data being plotted, several generic changes can be made to improve the appearance of graphics such as:

  • Using units that are better understood by more people
  • Varying font and marker size
  • Sorting the data
  • Changing default labels
  • Setting axis limits, log-scaling axes, and setting color limits
  • Using intuitive colors for appropriate variables

Although there are many more actions that could be taken to improve the appearance of figures, these are often specific to the dimensionality and nature of the data being visualized. It is always worth taking some time to think about the data being visualized as there are several steps that could be taken to modify a figure to generate clearer, more informative plots.

