
5 Steps to Build Beautiful Line Charts with Python

How to use the full capabilities of Matplotlib to tell a more compelling story


GDP Evolution over time of the 5 richest countries – Image by Author

Motivation

A few months back I wrote an article about bar charts and how you could make them clear, self-explanatory, and visually pleasing to the audience in order to tell a more compelling story (link below).

5 Steps to Build Beautiful Bar Charts with Python

In this article I look at line charts instead, which have their own specifics worth exploring.

Matplotlib makes it quick and easy to plot data with off-the-shelf functions, but the fine-tuning steps take more effort.

I spent quite some time researching best practices to build compelling charts with Matplotlib, so you don’t have to.

The idea is to go from this…

… to that:

All images, unless otherwise noted, are by the author.


0 The Data

To illustrate the methodology, I used a public dataset containing countries’ GDP information over the past 50 years:

Source: World Bank national accounts data, and OECD National Accounts data files. License URL: https://datacatalog.worldbank.org/public-licenses#cc-by License Type: CC BY-4.0

After importing the necessary packages to read the data and build our graphs, I simply filtered on the Top 20 countries of 2022:

import pandas as pd
import matplotlib.pyplot as plt
from datetime import timedelta

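# Read the data
df = pd.read_csv('88a1e584-0a94-4e73-b650-749332831ef4_Data.csv', sep=',')
df = df.drop(['Series Name', 'Series Code', 'Country Code'], axis=1)
df = df.dropna(subset=['Country Name'])

# Assumption: the file is already in long format with 'Year' and 'GDP' columns.
# Parse the years as dates so they can be shifted with timedelta further down.
df['Year'] = pd.to_datetime(df['Year'])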

# Filter on the Top 20 richest countries of 2022
top_20_countries = df[df['Year'] == '2022-01-01'].sort_values('GDP', ascending = False).head(20)['Country Name'].tolist()
df = df[df['Country Name'].isin(top_20_countries)].reset_index(drop = True)

df.head()

The dataset used throughout the article to build the different versions of the line chart is as follows:

Extract of the dataset used in this article – Image by Author
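The snippets in this article expect one row per country and year, with `Year` and `GDP` columns. If your export is in wide format instead (one column per year), a reshape along the following lines may help. This is only a sketch: the column headers such as `2022 [YR2022]`, the `..` placeholder for missing values, and the country names and figures are all assumptions made for illustration, not taken from the actual file.

```python
import pandas as pd

# Hypothetical wide-format extract: one row per country, one column per year
wide = pd.DataFrame({
    'Country Name': ['Country A', 'Country B'],
    '2021 [YR2021]': ['100', '..'],
    '2022 [YR2022]': ['110', '55'],
})

# Turn the year columns into rows: one row per (country, year) pair
long_df = wide.melt(id_vars='Country Name', var_name='Year', value_name='GDP')

# Keep the leading 4-digit year and parse it as a date (1 January of that year)
long_df['Year'] = pd.to_datetime(long_df['Year'].str[:4], format='%Y')

# Non-numeric placeholders such as '..' become NaN
long_df['GDP'] = pd.to_numeric(long_df['GDP'], errors='coerce')

long_df = long_df.sort_values(['Country Name', 'Year']).reset_index(drop=True)
```

The resulting `long_df` has the same three-column shape as the `df` used in the rest of the article.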

1 The Basic Plot

To start with, 4 lines of code are enough to create the figure and loop through the countries to plot one line each:

# Create the figure and axes objects, specify the size and the dots per inch
fig, ax = plt.subplots(figsize=(13.33,7.5), dpi = 96)

# Plot lines
for country in top_20_countries:
    data = df[df['Country Name'] == country]
    line = ax.plot(data['Year'], data['GDP'], label=country)
The most basic line chart from Matplotlib – Image by Author
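To try the loop without downloading the CSV, the sketch below runs the same pattern on a small synthetic frame. The country names and GDP values are invented for illustration; only the column names (`Country Name`, `Year`, `GDP`) follow the article. It also uses `groupby` instead of filtering the frame inside the loop, which is an alternative I am suggesting, not what the snippet above does.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic stand-in for the real dataset (all values are invented)
years = pd.to_datetime(['2020-01-01', '2021-01-01', '2022-01-01'])
demo_df = pd.DataFrame({
    'Country Name': ['A'] * 3 + ['B'] * 3 + ['C'] * 3,
    'Year': list(years) * 3,
    'GDP': [1.0e12, 1.1e12, 1.2e12, 8.0e11, 8.5e11, 9.0e11, 5.0e11, 5.2e11, 5.1e11],
})

fig, ax = plt.subplots(figsize=(13.33, 7.5), dpi=96)

# groupby yields one sub-frame per country, so no explicit filtering is needed
for country, data in demo_df.groupby('Country Name'):
    ax.plot(data['Year'], data['GDP'], label=country)
```

Note that `groupby` iterates over countries in alphabetical order, not by GDP rank, so keep the explicit list when the plotting order matters.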

2 The Essentials

Let’s add a few vital things to our chart to make it more readable to the audience.

  • Grids Grid lines are essential to a graph’s readability. Their transparency is set to 0.5 so they don’t interfere too much with the data points.

  • X-axis and Y-axis reformatting I deliberately added more parameters than necessary here to give a more comprehensive view of the fine-tuning possibilities. For example, the x-axis does not need a major_formatter or a major_locator as we are only showing years, but if your x-axis holds other kinds of values, they can come in handy.

  • Legend As we are displaying many lines, it’s important to add labels and a legend to tell which is which.

# Add legend
ax.legend(loc="best", fontsize=8)

# Create the grid 
ax.grid(which="major", axis='x', color='#DAD8D7', alpha=0.5, zorder=1)
ax.grid(which="major", axis='y', color='#DAD8D7', alpha=0.5, zorder=1)

# Reformat x-axis label and tick labels
ax.set_xlabel('', fontsize=12, labelpad=10) # No need for an axis label
ax.xaxis.set_label_position("bottom")
# Optional extras; MaxNLocator requires `from matplotlib.ticker import MaxNLocator`
#ax.xaxis.set_major_formatter(lambda s, i : f'{s:,.0f}') #in case we need additional formatting
#ax.xaxis.set_major_locator(MaxNLocator(integer=True)) #in case we need additional formatting
ax.xaxis.set_tick_params(pad=2, labelbottom=True, bottom=True, labelsize=12, labelrotation=0)

# Reformat y-axis
ax.set_ylabel('GDP (Billions USD)', fontsize=12, labelpad=10)
ax.yaxis.set_label_position("left")
ax.yaxis.set_major_formatter(lambda s, i : f'{s*10**-9:,.0f}')
#ax.yaxis.set_major_locator(MaxNLocator(integer=True)) #in case we need additional formatting
ax.yaxis.set_tick_params(pad=2, labeltop=False, labelbottom=True, bottom=False, labelsize=12)
Adding a few essentials features to our chart – Image by Author
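The formatter and locator mentioned above can also be tried in isolation. The sketch below is a minimal example under my own assumptions: the plotted values are invented, and `billions_formatter` is a name introduced here for illustration. It wraps the same "raw USD to billions" conversion in a `FuncFormatter` and shows the import that `MaxNLocator` needs.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MaxNLocator

# Convert raw USD values to billions, with a thousands separator and no decimals
billions_formatter = FuncFormatter(lambda value, pos: f'{value * 1e-9:,.0f}')

fig, ax = plt.subplots()
ax.plot([2020, 2021, 2022], [1.9e13, 2.1e13, 2.5e13])  # invented values in USD
ax.yaxis.set_major_formatter(billions_formatter)

# Restrict x ticks to whole numbers, useful when the x-axis holds integer years
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
```

Passing a plain function to `set_major_formatter`, as the article does, relies on a reasonably recent Matplotlib; the explicit `FuncFormatter` is the more conservative spelling.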

3 Focus on the story to tell

Now comes the time to highlight what needs to stand out in our graph to tell the story behind it. In this particular case let’s highlight the 5 richest countries and track their GDP evolution over time.

We define specific colors and line styles in dictionaries, and slightly modify our code to plot them separately.

# Color and line style
colors_dict = {'United States': '#014f86', 'China': '#DC0000', 'Japan': '#ff4d6d', 'Germany': '#403d39', 'India': '#6a994e'}
line_styles_dict = {'United States': '-', 'China': '-', 'Japan': '-', 'Germany': '-', 'India': '-'}

# Plot the Top 5 lines
for country in top_20_countries[:5]:
    color = colors_dict.get(country, 'grey')  # get the color from the dictionary, default to grey if not found
    line_style = line_styles_dict.get(country, '-')  # get the line style from the dictionary, default to solid line if not found
    data = df[df['Country Name'] == country]
    line = ax.plot(data['Year'], data['GDP'], color=color, linestyle=line_style, zorder=2, label=country)

# Add legend
ax.legend(loc="best", fontsize=8)

# Plot the rest
for country in top_20_countries[5:]:
    data = df[df['Country Name'] == country]
    line = ax.plot(data['Year'], data['GDP'], color='grey', linestyle=':', linewidth=0.5, zorder=2)
Still the same line chart but a clearer story – Image by Author
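The highlight pattern boils down to two things: give colour and a label only to the lines that carry the story, and leave the others grey and unlabeled so they stay out of the legend. Here is a small self-contained sketch of that idea. The series names and values are invented, and `highlight_colors` is a hypothetical name used only in this example.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Invented series: name -> y-values
series = {'A': [1, 2, 4], 'B': [1, 2, 3], 'C': [1, 1.5, 2], 'D': [1, 1.2, 1.4]}
x = [2020, 2021, 2022]
highlight_colors = {'A': '#014f86', 'B': '#DC0000'}  # only these stand out

fig, ax = plt.subplots()
for name, y in series.items():
    if name in highlight_colors:
        # Highlighted line: own colour, drawn on top, labeled for the legend
        ax.plot(x, y, color=highlight_colors[name], zorder=3, label=name)
    else:
        # Context line: thin, grey, dotted, and no label so the legend skips it
        ax.plot(x, y, color='grey', linestyle=':', linewidth=0.5, zorder=2)

ax.legend(loc='best', fontsize=8)
handles, labels = ax.get_legend_handles_labels()
```

Because unlabeled lines are ignored by the legend, it does not matter whether `ax.legend` is called before or after the grey lines are drawn.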

4 The Professional Look

Adding a few more features will make our graph look far more professional. They work on top of any chart and are independent of the data used in this article. With the code snippet below, these adjustments take little to no effort to implement. My advice: save the snippet and reuse it at will, tweaking it to create your own visual identity.

  • Spines The spines make up the box visible around the graph. They are removed, except for the left one which is set to be a bit thicker.

  • Red line and rectangle on top A red line and rectangle are added above the title to nicely isolate the graph from the text above it.

  • Title and subtitle What is a graph without a title to introduce it? The subtitle can be used to further explain the content or even present a first conclusion.

  • Source A must-have in every chart you produce.

  • Margin adjustments The margins surrounding the graph area are adjusted to make sure all the available space is used.

  • White background Setting an explicit white background is useful when sending the chart via email, Teams or any other tool, because some environments (notebooks in particular) can render the figure with a transparent background, which can be problematic.

# Remove the spines
ax.spines[['top','right','bottom']].set_visible(False)

# Make the left spine thicker
ax.spines['left'].set_linewidth(1.1)

# Add in red line and rectangle on top
ax.plot([0.05, .9], [.98, .98], transform=fig.transFigure, clip_on=False, color='#E3120B', linewidth=.6)
ax.add_patch(plt.Rectangle((0.05,.98), 0.04, -0.02, facecolor='#E3120B', transform=fig.transFigure, clip_on=False, linewidth = 0))

# Add in title and subtitle
ax.text(x=0.05, y=.93, s="Evolution of the 20 Richest Countries GDP over the Past 50 Years", transform=fig.transFigure, ha='left', fontsize=14, weight='bold', alpha=.8)
ax.text(x=0.05, y=.90, s="Focus on the current 5 richest countries from 1973 to 2022", transform=fig.transFigure, ha='left', fontsize=12, alpha=.8)

# Set source text
ax.text(x=0.05, y=0.12, s="Source: World Bank - https://databank.worldbank.org/", transform=fig.transFigure, ha='left', fontsize=10, alpha=.7)

# Adjust the margins around the plot area
plt.subplots_adjust(left=None, bottom=0.2, right=None, top=0.85, wspace=None, hspace=None)

# Set a white background
fig.patch.set_facecolor('white')
Our visual identity applied to the chart making it neater – Image by Author
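Since these tweaks do not depend on the data, one way to reuse them is to wrap them in a function. The sketch below is one possible packaging, not part of the original snippet: the function name `apply_house_style`, its parameters and the placeholder texts are all my own choices, while the calls inside mirror the ones shown above.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def apply_house_style(fig, ax, title, subtitle, source, accent='#E3120B'):
    """Apply the spine, header, source, margin and background tweaks to one chart."""
    # Hide every spine except the left one, which is made slightly thicker
    for side in ('top', 'right', 'bottom'):
        ax.spines[side].set_visible(False)
    ax.spines['left'].set_linewidth(1.1)

    # Accent line and small rectangle, positioned in figure coordinates
    ax.plot([0.05, 0.9], [0.98, 0.98], transform=fig.transFigure,
            clip_on=False, color=accent, linewidth=0.6)
    ax.add_patch(plt.Rectangle((0.05, 0.98), 0.04, -0.02, facecolor=accent,
                               transform=fig.transFigure, clip_on=False, linewidth=0))

    # Title, subtitle and source, also in figure coordinates
    ax.text(x=0.05, y=0.93, s=title, transform=fig.transFigure,
            ha='left', fontsize=14, weight='bold', alpha=0.8)
    ax.text(x=0.05, y=0.90, s=subtitle, transform=fig.transFigure,
            ha='left', fontsize=12, alpha=0.8)
    ax.text(x=0.05, y=0.12, s=source, transform=fig.transFigure,
            ha='left', fontsize=10, alpha=0.7)

    # Leave room for the header and the source line, and force a white background
    fig.subplots_adjust(bottom=0.2, top=0.85)
    fig.patch.set_facecolor('white')


fig, ax = plt.subplots(figsize=(13.33, 7.5), dpi=96)
ax.plot([2020, 2021, 2022], [1, 2, 3])  # placeholder data
apply_house_style(fig, ax, 'Demo title', 'Demo subtitle', 'Source: placeholder')
```

Keeping the accent colour as a parameter makes it easy to adapt the look to another visual identity without touching the rest of the function.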

5 The Final Touch

To get to the end result introduced at the beginning of the article, the only thing left to do is to implement these few extra components:

  • End Point Markers These elements are purely esthetic but add a nice touch to our line chart. We highlight the very last point of each line with markers to make them stand out.

  • Annotations Thanks to the annotate method, we can highlight specific points in our graph and add a comment directly onto it.

# Plot the Top 5 lines
for country in top_20_countries[:5]:
    color = colors_dict.get(country, 'grey')  # get the color from the dictionary, default to grey if not found
    line_style = line_styles_dict.get(country, '-')  # get the line style from the dictionary, default to solid line if not found
    data = df[df['Country Name'] == country]
    line = ax.plot(data['Year'], data['GDP'], color=color, linestyle=line_style, zorder=2, label = country)
    ax.plot(data['Year'].iloc[-1], data['GDP'].iloc[-1], 'o', color=color, markersize=10, alpha=0.3)
    ax.plot(data['Year'].iloc[-1], data['GDP'].iloc[-1], 'o', color=color, markersize=5)

# Add some text on the graph
# `data` still refers to the last country plotted in the loop above; it is only used here for its dates
ax.annotate('During the 2000s,\nChina began experiencing rapid economic growth,\noutpacing all other countries.',
            (data['Year'].iloc[-18], 2000000000000),
            xytext=(data['Year'].iloc[-28]-timedelta(days=500), 18000000000000),
            ha='left', fontsize=9, arrowprops=dict(arrowstyle='-|>', facecolor='k', connectionstyle="arc3,rad=-0.15"))
The final product: the line chart is easily readable – Image by Author
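Both components can be tested on a toy line before applying them to the real chart. In the sketch below, the values and the annotation text are invented, and the x-axis uses plain integer years instead of dates to keep the example short, so no `timedelta` shift is needed.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

x = [2018, 2019, 2020, 2021, 2022]  # integer years keep the sketch simple
y = [1.0, 1.5, 2.5, 4.0, 6.0]       # invented values

fig, ax = plt.subplots()
ax.plot(x, y, color='#DC0000', zorder=2)

# End-point marker: a large translucent dot under a smaller solid one
ax.plot(x[-1], y[-1], 'o', color='#DC0000', markersize=10, alpha=0.3)
ax.plot(x[-1], y[-1], 'o', color='#DC0000', markersize=5)

# xy is the point the arrow targets, xytext is where the text starts;
# '\n' inside the string breaks the comment over two lines
note = ax.annotate('Growth accelerates\nafter 2020',
                   xy=(2020, 2.5), xytext=(2018.2, 5.0),
                   ha='left', fontsize=9,
                   arrowprops=dict(arrowstyle='-|>', facecolor='k',
                                   connectionstyle='arc3,rad=-0.15'))
```

The negative `rad` value bends the arrow in one direction; flipping its sign bends it the other way, which helps when the arrow would otherwise cross a line.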

6 Final Thoughts

The intent of this article was to share the knowledge I gathered here and there on building a more compelling line chart with Matplotlib. I tried to make it as practical as possible with reusable code snippets.

I am sure there are other adjustments to be made that I did not think of. If you have any improvement ideas, feel free to comment and make this article more useful to all!

This article only focused on line charts. Stay tuned for more!


Thanks for reading all the way to the end of the article. Follow for more! Feel free to leave a message below, or reach out to me through LinkedIn / X if you have any questions / remarks!

Get an email whenever Guillaume Weingertner publishes.

