The Bare Minimum Guide to Matplotlib

Read a more detailed version of this post on my personal blog by clicking here

Line charts, scatter plots, histograms, and a bunch of ways to customize them

Matplotlib is the quintessential Python library for data visualization. It’s easy to use and flexible, and many other visualization libraries build on top of it. Learning Matplotlib therefore makes it easier to understand and work with some of the fancier visualization libraries.

Getting started

You’ll need to install the Matplotlib library. Assuming you have a terminal at your disposal and pip installed, you can install Matplotlib with the following command: pip install matplotlib. You can read more about the installation in Matplotlib’s installation guide.
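To confirm that the installation worked, a quick check is to import the package and print its version. This is a minimal sketch, assuming Matplotlib was installed into the same Python environment you are running:

```python
import matplotlib

# If the import succeeds, Matplotlib is installed in this environment.
print(matplotlib.__version__)
```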

Object-oriented approach

We’ll begin by making a simple scatter chart. First, we import Matplotlib’s pyplot module under its conventional alias plt; this is the interface we’ll use for plotting.

import matplotlib.pyplot as plt
import numpy as np

We also import numpy, so we can easily generate points to plot! Let’s pick some points on the sine function. We choose some x-values and then calculate the y-values with np.sin.

x = np.linspace(-3, 3, num=10)
y = np.sin(x)

Now that we’ve generated our points, we can make our scatter chart! We start by making a Figure object and an Axes object.

fig = plt.figure()
ax = fig.add_subplot()

We can think of the Figure object as the frame that we put plots into, and of the Axes object as an actual plot in that frame. We then add the scatter chart to the Axes object and use plt.show() to display the chart.

ax.scatter(x, y)
plt.show()

This is the gist of it!
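To make the frame-versus-plot distinction concrete, here is a small sketch with one Figure holding two Axes. It uses plt.subplots, a convenience function not used elsewhere in this post, which creates the Figure and its Axes in one call. The names ax_left and ax_right are just illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, num=10)

# One Figure (the frame) holding two Axes (the plots), side by side.
fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2, figsize=(8, 4))
ax_left.scatter(x, np.sin(x))
ax_right.scatter(x, np.cos(x))

# Both Axes belong to the same Figure.
print(len(fig.axes))
plt.show()
```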

Line charts

Here are examples of colours that we can use. We can specify colours in many different ways: by plain old name, as an RGB tuple, or as a hex code.

from scipy.stats import norm
x = np.linspace(-4, 4, num=100)
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.plot(x, norm.pdf(x, loc=-1, scale=1), color="magenta")
ax.plot(x, norm.pdf(x, loc=0, scale=1), color=(0.85, 0.64, 0.12))
ax.plot(x, norm.pdf(x, loc=1, scale=1), color="#228B22")
plt.show()
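If you are curious how these different specifications relate, the matplotlib.colors module can convert each of them to a common RGBA tuple. The sketch below is only for inspection and is not needed for plotting:

```python
import matplotlib.colors as mcolors

# The three colour specifications used above: a name, an RGB tuple, a hex code.
specs = ["magenta", (0.85, 0.64, 0.12), "#228B22"]

for spec in specs:
    # to_rgba converts any valid colour specification to an (r, g, b, a) tuple.
    print(spec, "->", mcolors.to_rgba(spec))

# Named colours map to hex codes; "forestgreen" is the same colour as "#228B22".
print(mcolors.to_hex("forestgreen"))
```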

There are also many predefined linestyles that we can use. Note that if we don’t specify colours, Matplotlib automatically picks distinct default colours for our lines.

x = np.linspace(-6, 6, num=100)
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.plot(x, norm.pdf(x, loc=-3, scale=1), linestyle="solid")
ax.plot(x, norm.pdf(x, loc=-1, scale=1), linestyle="dotted")
ax.plot(x, norm.pdf(x, loc=1, scale=1), linestyle="dashed")
ax.plot(x, norm.pdf(x, loc=3, scale=1), linestyle="dashdot")
plt.show()

We can also adjust the width of our lines!

x = np.linspace(-2, 9, num=100)
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
for i in range(1, 7):
    ax.plot(
        x, norm.pdf(x, loc=i, scale=1), color="black", linewidth=i/2
    )
plt.show()

Scatter charts

For scatter charts, we can change the markers and their size with the marker and s parameters. Here’s an example:

x = np.linspace(-4, 4, num=20)
y1 = x
y2 = -y1
y3 = y1**2
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.scatter(x=x, y=y1, marker="v", s=1)
ax.scatter(x=x, y=y2, marker="X", s=5)
ax.scatter(x=x, y=y3, marker="s", s=10)
plt.show()

We can also combine line and scatter charts using the [ax.plot](https://matplotlib.org/3.3.4/api/_as_gen/matplotlib.pyplot.plot.html) function by passing a format string as the fmt parameter. The format string has an optional part each for marker, line, and colour: fmt = [marker][line][color]. If fmt = "s--m", we get square markers and a dashed line, both coloured magenta. The example here uses 'H-g': hexagon markers, a solid line, and green.

x = np.linspace(-2, 2, num=20)
y = x ** 3 - x
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.plot(x, y, 'H-g')
plt.show()

Histograms

We can make histograms easily using the ax.hist function.

x = np.random.randn(10000)
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.hist(x)
plt.show()

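We can change a lot of things in the histogram to make it nicer, and we can even draw several histograms on the same chart! Here we set the colour, the number of bins, and the transparency (alpha), and we use density=True to normalize each histogram so that its area is 1.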

x1 = np.random.randn(10000)-1
x2 = np.random.randn(10000)+1
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.hist(
    x1,
    color='turquoise',
    edgecolor='none',
    bins=50,
    alpha=0.5,
    density=True
)
ax.hist(
    x2,
    color='magenta',
    edgecolor='none',
    bins=200,
    alpha=0.5,
    density=True
)
plt.show()
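The ax.hist function also returns the values it computed, which is handy for checking what density=True does. The following is a minimal sketch; it uses NumPy's default_rng with a fixed seed (an assumption made here for reproducibility, where the examples above use np.random.randn).

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the example is reproducible (an assumption of this sketch).
rng = np.random.default_rng(seed=0)
x = rng.standard_normal(10000)

fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
# hist returns the bar heights, the bin edges, and the drawn bar objects.
heights, bin_edges, patches = ax.hist(x, bins=50, density=True)

# With density=True, bar height times bar width sums to 1 over all bars.
area = np.sum(heights * np.diff(bin_edges))
print(len(heights), len(bin_edges), area)
plt.show()
```

Note that there is always one more bin edge than there are bins.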

Legends

Naturally, we’ll want to add a legend to our chart. This is simply done with the [ax.legend](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.legend.html) function.

x = np.linspace(-2, 2, num=100)
y1 = x
y2 = x**2
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.plot(x, y1, color='turquoise', label='First')
ax.plot(x, y2, color='magenta', label='Second')
ax.legend()
plt.show()

Matplotlib will automatically try to find the best position for the legend on your chart, but we can change it by providing an argument for the loc parameter. Also, a common preference is to not have a frame around the legend, and we can disable it by setting the frameon parameter to False. Additionally, Matplotlib lists the elements of the legend in one column by default, but we can set the number of columns with the ncol parameter.

x = np.linspace(-2, 2, num=100)
y1 = x
y2 = np.sin(x)+np.cos(x)
y3 = x**2
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.plot(x, y1, color='turquoise', label='First')
ax.plot(x, y2, color='magenta', label='Second')
ax.plot(x, y3, color='forestgreen', label='Third')
ax.legend(loc='lower center', frameon=False, ncol=3)
plt.show()

Final tips

There are so many quirks and different things you can do with Matplotlib, and unfortunately I cannot provide them all here. However, a few guidelines to get you started:

  1. You save charts with the plt.savefig() function.
  2. There are a bunch of other libraries that could be beneficial to the specific chart you’re trying to create. Seaborn builds on top of Matplotlib, while Bokeh and Plotly are independent alternatives, and there are many more.
  3. Look at the [gallery](https://matplotlib.org/stable/gallery/index.html). Please, please, look at the gallery! Don’t waste 3 hours working on a chart if someone has already made it.
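As a sketch of the first tip, here is the scatter chart from the beginning saved to a file. The filename sine_scatter.png and the dpi and bbox_inches settings are illustrative choices, not requirements.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, num=10)
y = np.sin(x)

fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot()
ax.scatter(x, y)

# Save the current figure; the file format is inferred from the extension.
# bbox_inches="tight" trims excess whitespace around the chart.
plt.savefig("sine_scatter.png", dpi=150, bbox_inches="tight")
```

If you also call plt.show(), it is safest to save first, since the figure may be closed once the window is dismissed.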