
A layman’s guide to plot with python and matplotlib

Do you want to learn to visualize data in Python in less than 10 minutes? Grab a coffee and continue.

Photo by Luke Chesser on Unsplash

Break the ice with matplotlib during a 10-minute coffee break!

Data visualization is an essential, routine task in any field, regardless of how small or large the data is, be it plotting a simple sinusoidal curve or creating sophisticated, interactive dashboards for stakeholders. Python, one of the most sought-after programming languages today (in the active top 3 on Stack Overflow), offers tremendous possibilities for data visualization.

Matplotlib is one such powerful plotting library for Python that offers a varied spectrum of 1D, 2D, and 3D visualizations. As a gold badge holder for matplotlib on Stack Overflow, I thought of creating a series of tutorials that teach the basics of using Python and matplotlib to plot eye-catching figures, accompanied by ready-to-use Jupyter Notebooks.

This first post will teach you the essential basics of matplotlib, taking the use case of one-dimensional plots, together with basic bar and pie charts. The whole Notebook can be downloaded/forked from my GitHub repository. The following system settings are used: Python version: 3.7.7, Matplotlib version: 3.1.3. The tutorial is expected to work on older versions, too.

1) Let’s get started with 1D plots

To enable inline plotting in Jupyter Notebooks, use the following magic command at the beginning of the Notebook (magic names are case-sensitive, so it must be written in lowercase):

%matplotlib inline

If your screen supports retina display, you can enhance the resolution of the figure by using

%config InlineBackend.figure_format = 'retina'

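Matplotlib offers several cool style sheets to enhance the figure aesthetics. I will use 'fivethirtyeight'. All snippets in this post assume that pyplot has been imported as plt:

import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')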

1.1) The simplest plot

The simplest example would be to plot a list of values. Let’s create a list of incremental values and plot it.

values = [5, 6, 7, 8, 9]
plt.plot(values)
A simple line plot

You can see that the values are plotted on the y-axis. For plotting in the x-y space, you typically need two lists: one for the x-values and the other for the y-values. Please note that, by default, solid lines are used for plotting.

Now you might be wondering how the figure was created without passing the x-values.

By default, when we pass a single list to plt.plot(), the x-axis assumes integer values starting from zero up to one less than the length of the y-values. In the above figure, values had five elements. So the x-values became 0, 1, 2, 3, 4 (because indexing starts from 0 in Python). In other words, the x-values are range(len(values)), where len(values) returns the length of the list, i.e. 5. Therefore, the following line will also plot the same figure:

plt.plot(range(len(values)), values)
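
As a quick check of the claim above, the following minimal sketch reads back the x-data that matplotlib generated for a single-list call. The variable names line and x_auto are introduced here only for illustration.

```python
import matplotlib.pyplot as plt

values = [5, 6, 7, 8, 9]

# plt.plot() returns a list of Line2D objects; unpack the single line.
(line,) = plt.plot(values)

# Read back the automatically generated x-data and convert it to plain ints.
x_auto = [int(v) for v in line.get_xdata()]

print(x_auto)
print(x_auto == list(range(len(values))))
```

The second print compares the generated x-values with range(len(values)), which is the equivalence described in the text.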

NOTE: If you are using some other Python IDE, for instance Spyder, you will need to call plt.show() after the plt.plot() command for the figure window to appear. You don’t need it in the Jupyter Notebook though.

1.2) Let’s plot some functions, 𝑦 =𝑓(𝑥)

We will plot the following two functions for 𝑥∈(0,3𝜋):

  • y = sin(x)
  • y = cos(x)

The NumPy package comes in handy for performing vectorized operations:

import numpy as np

Next, we will define the 𝑥-values and compute the corresponding 𝑦-values:

x = np.linspace(0., 3*np.pi, 100) # 0 to 3*Pi in 100 steps
y_1 = np.sin(x) 
y_2 = np.cos(x)

Let’s now plot both functions in the same figure. In a 1D plot, the two key elements are lines and markers (symbols), and the properties of both can be customized while plotting. In the code snippet below, I will use some essential properties (with selected options) that are worth knowing. The parameters passed to the plt.plot() command will be explained later in this post.
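
The original snippet is not reproduced in this text, so the following is only a sketch of what such a plot could look like. It draws the sine curve as a solid line and the cosine curve with circular markers, as described later in this post. The specific colors, sizes, and labels are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0., 3*np.pi, 100)  # 0 to 3*Pi in 100 steps
y_1 = np.sin(x)
y_2 = np.cos(x)

plt.figure(figsize=(8, 6))

# Sine curve: solid red line (color and width are illustrative choices).
plt.plot(x, y_1, '-', color='r', lw=2, label='sin(x)')

# Cosine curve: blue circular markers only, no connecting line.
plt.plot(x, y_2, 'o', color='b', ms=6, label='cos(x)')

plt.xlabel('x')
plt.ylabel('y')
plt.legend()
```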

Sinusoidal and cosinusoidal curves in the same figure.

1.3) Object-Oriented Approach to Plotting

Now let’s use an object-oriented approach, which makes it possible to manipulate the properties of the figure. Among others, there are the following two possibilities:

Method 1

fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(111)

Method 2

fig, ax = plt.subplots(figsize=(8,6))

I will demonstrate the use of Method 2 (my favorite) below:
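
The original snippet is not reproduced in this text. A minimal sketch using Method 2 might look as follows. The styling values, axis labels, and title are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0., 3*np.pi, 100)
y_1 = np.sin(x)
y_2 = np.cos(x)

# Method 2: create the figure and a single axes object in one call.
fig, ax = plt.subplots(figsize=(8, 6))

# All plotting and labeling now goes through the axes object 'ax'.
ax.plot(x, y_1, '-', color='r', lw=2, label='sin(x)')
ax.plot(x, y_2, 'o', color='b', ms=6, label='cos(x)')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Object-oriented plotting')  # illustrative title
ax.legend(loc='best')
```

Note the naming difference: the pyplot functions plt.xlabel() and plt.title() correspond to the axes methods ax.set_xlabel() and ax.set_title().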

Sinusoidal and cosinusoidal curves using the above object-oriented approach.

2) Customizing the plots

2.1) Choosing line styles and markers

In both the above approaches, I used a solid line and circular markers. The following section will teach you how to choose different line styles and markers.

1. Line styles: You can either explicitly use the keyword linestyle='-' or the abbreviation ls='-'. Here are a few of the available line styles:

  • '-' : solid line
  • '--': dashed line
  • '-.': dashed-dotted line
  • ':' : dotted line

2. Markers: You can use the keyword marker='o'. Here are a few of the available markers:

  • 'o': circle
  • 's': square
  • 'p': pentagon
  • 'h': hexagon
  • '^': up-triangle
  • 'v': down-triangle
  • 'x': cross

2.2) Choosing colors

1. Colors: You can either explicitly use the keyword color='g' or the abbreviation c='g'.

  • Standard colors can be used as single letters: 'r', 'g', 'b', 'k' for red, green, blue, black etc. Note that there is no single-letter code for orange ('o' is a marker, not a color).
  • You can also specify the complete name, such as 'red', 'green', 'orange' for the standard colors, as well as other colors such as 'navy', 'darkorange', 'indigo'.
  • The complete list of allowed colors can be found here.
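
One way to check whether matplotlib accepts a given string as a color is the helper is_color_like from matplotlib.colors. The sketch below tries a few strings. The list candidates is just an illustrative selection.

```python
from matplotlib.colors import is_color_like, to_rgba

# A few strings to test; 'o' is a marker code, not a color.
candidates = ['r', 'k', 'orange', 'navy', 'darkorange', 'o']

# Map each string to True/False depending on whether it is a valid color.
accepted = {name: is_color_like(name) for name in candidates}
for name, ok in accepted.items():
    print(name, ok)

# to_rgba() converts a valid color into an (R, G, B, A) tuple of floats.
red_rgba = to_rgba('r')
print(red_rgba)
```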

Trick: You can combine the line style and the color by using the shorthand notation '-r', '--g', '-.k' etc. However, you cannot use something like linestyle='-.r' or ls='-k', because these keywords accept only a line style, whereas the string specifies both a line style and a color.

BONUS: Combining lines and markers ⇒ If you want to use both lines and markers, there are two possibilities. Suppose you want a red dashed-dotted line with a square-shaped marker. You can use either of the following two options:

  • '-.rs'
  • '-.r', marker='s'
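
To illustrate that the two options are equivalent, the sketch below plots the same kind of line both ways and compares the resulting line properties. The data and the variable names line_a and line_b are assumptions made for this example.

```python
import matplotlib.pyplot as plt

values = [5, 6, 7, 8, 9]
shifted = [v + 1 for v in values]  # offset copy so both lines are visible

fig, ax = plt.subplots()

# Option 1: line style, color, and marker in a single format string.
(line_a,) = ax.plot(values, '-.rs')

# Option 2: line style and color in the format string, marker as a keyword.
(line_b,) = ax.plot(shifted, '-.r', marker='s')

# Both lines should report the same style, color, and marker.
same_style = line_a.get_linestyle() == line_b.get_linestyle()
same_color = line_a.get_color() == line_b.get_color()
same_marker = line_a.get_marker() == line_b.get_marker()
print(same_style, same_color, same_marker)
```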

2.3) Controlling properties of lines and markers

You now know, from the previous section, which line styles and markers you can use. The following will teach you how to customize their properties.

1. Lines:

  • lw stands for line width, e.g.: lw=2. You can also use the keyword linewidth=2.
  • alpha controls transparency (0 denotes completely transparent and 1 denotes opaque). You can choose any float value between 0 and 1.

2. Markers:

  • ms stands for marker size, e.g.: ms=10. You can also use the keyword markersize=10.
  • mfc stands for marker face color, e.g.: mfc='green'. You can also use the keyword markerfacecolor='green'.
  • mec stands for marker edge color, e.g.: mec='red'. You can also use the keyword markeredgecolor='red'.
  • mew stands for marker edge width, e.g.: mew=5. You can also use the keyword markeredgewidth=5.
  • alpha controls transparency (0 denotes completely transparent and 1 denotes opaque). You can choose any float value between 0 and 1.
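
The sketch below combines the line and marker properties listed above in a single ax.plot() call and reads them back from the returned line object. The data and the chosen values are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Fewer points than before so that the individual markers stay visible.
x = np.linspace(0., 3*np.pi, 20)
y = np.sin(x)

fig, ax = plt.subplots(figsize=(8, 6))

# '-o' draws a solid line with circular markers; the keywords customize both.
(line,) = ax.plot(
    x, y, '-o',
    lw=2,          # line width
    alpha=0.7,     # transparency of line and markers
    ms=10,         # marker size
    mfc='green',   # marker face color
    mec='red',     # marker edge color
    mew=2,         # marker edge width
)

print(line.get_linewidth(), line.get_markersize(), line.get_alpha())
```

The long keyword names (linewidth, markersize, markerfacecolor, markeredgecolor, markeredgewidth) can be used in place of the abbreviations.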

3) Scatter plots

So far we used plt.plot() (or ax.plot()), where the two curves were plotted using lines and markers, respectively. Sometimes you want a scatter plot instead. This can be achieved either indirectly, using markers as you saw for the cosine curve above, or directly via a scatter plot.

⇒ In plt.plot(), you can pass only the y-values and the x-values will be automatically generated. However, in the scatter plot, you need to pass both x and y-values.

Following are some essential, characteristic parameters of the scatter plot:

  • marker: to choose the scatter style. You cannot use the abbreviation m in this case.
  • s: to define the marker size.
  • c: to define the scatter color. You can also use the keyword color.
  • edgecolor: to define marker’s edge color. You can also use the keyword ec.
  • facecolor: to define the color of the marker. You can also use the keyword fc. It does the same thing as c. However, if you first specify c='red' and then additionally specify facecolor='blue', the initially set red marker will now be superseded by the blue color.

Similar to plt.plot(), for standard colors, you can either use their full names ('red', 'blue', 'green' etc.) or their single-letter codes 'r', 'g', 'b' etc.

You can play around with the True and False values passed to ax.legend() to see how they affect the legend box.

⇒ You can read the official documentation of scatter plots for more details.

fig, ax = plt.subplots(figsize=(8,6))
# x and y_1 are the arrays defined in Section 1.2.
ax.scatter(x, y_1, color='red', marker='s', edgecolor='black', s=50, label='$f(x)$=sin($x$)')
# 'fancybox=False' produces a legend box with sharp edges.
ax.legend(loc='best', frameon=True, fancybox=False)
ax.set_title('A simple scatter plot');
A scatter plot showing sinusoidal curve.

4) Bar charts

4.1) Vertical bar chart (aka column chart)

⇒ I will use the object-oriented approach. You can use plt.bar() as well. Following are some essential parameters passed to the bar plot:

  • width: width of the bar.
  • edgecolor: to define the edge color of the bars. You can also use ec.
  • linewidth: to define the width of the edges of the bars. You can also use lw.
  • linestyle: to define the style of the edges. You can also use ls. The same styles can be used as specified earlier for the line plots.
  • color: to define the color/colors of the bars. You cannot use the abbreviation c in this case.
  • hatch: to define the hatch/fill style of the bars. You can use styles like o, *, /, //, \, -, +, |. The hatches will also have the same color as the edges.

⇒ You can read the official documentation of bar charts for more details.

Let’s define some sample data, for example grades of 5 students in a class and plot a vertical bar chart.

names = ['Tom', 'Mark', 'Sam', 'Jim', 'Henry']
grades = [79, 85, 49, 98, 65]
fig, ax = plt.subplots(figsize=(8,5))
ax.bar(names, grades, width=0.5, hatch='/', edgecolor='black', ls='--', linewidth=3, color='orange')
ax.set_ylabel('Grades')
ax.set_title('A basic bar chart');
A vertical bar chart displaying grades of five students in a class.

⇒ If you want each bar in a specific color, you can pass a list of the desired colors. If the number of elements in the passed colors list is less than the number of bars, the color cycle will repeat itself.

For example, as shown below, if you have five bars but you choose color=['r', 'b', 'g'], the first three bars will respectively be red, blue, green and the remaining two bars will be red and blue, respectively.

fig, ax = plt.subplots(figsize=(8,5))
ax.bar(names, grades, width=0.5, hatch='/', edgecolor='black', linewidth=3, color=['r', 'b', 'g'])
ax.set_ylabel('Grades');
A vertical bar chart with custom colors.

4.2) Horizontal bar chart

For plotting a horizontal bar chart, you just need to use barh instead of bar.

Most importantly, you have to replace the keyword width with height, which now sets the thickness of the bars.

fig, ax = plt.subplots(figsize=(8,5))
ax.barh(names, grades, height=0.5, hatch='/', edgecolor='black', linewidth=3, color=['r', 'b', 'g'])
ax.set_xlabel('Grades');
A horizontal bar chart displaying grades of five students in a class.

5) Pie charts

⇒ I will use the object-oriented approach. You can use plt.pie() as well.

Following are some essential, characteristic parameters of the pie chart:

  • labels: name labels for each wedge of the pie chart.
  • shadow: to enable the shadow effect. It can either be True or False.
  • labeldistance: to control the radial distance of the labels from the center of the pie chart.
  • autopct: to automatically calculate the percentage distribution and label each wedge with the corresponding value in the defined format. For example, '%.0f%%' will display the percentages as integer values, '%.2f%%' will display the percentages up to two decimal places.

⇒ You can read the official documentation of pie charts for more details.
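
To see what the two autopct formats produce, the sketch below applies them by hand with plain Python string formatting. It uses the shares from the example in this section and assumes that each wedge's percentage is its value divided by the total. The names labels_int and labels_2dp are introduced only for illustration.

```python
shares = [15, 13, 46, 22, 30]
total = sum(shares)

# Percentage of each wedge relative to the whole pie.
percentages = [100 * share / total for share in shares]

# '%.0f%%' rounds to an integer; '%.2f%%' keeps two decimal places.
# The doubled '%%' prints a literal percent sign.
labels_int = ['%.0f%%' % pct for pct in percentages]
labels_2dp = ['%.2f%%' % pct for pct in percentages]

print(labels_int)
print(labels_2dp)
```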

names = ['Tom', 'Mark', 'Sam', 'Jim', 'Henry']
shares = [15, 13, 46, 22, 30]
fig, ax = plt.subplots(figsize=(8,8))
ax.pie(shares, labels=names, autopct='%.2f%%', shadow=True, labeldistance=1.1)
ax.set_title('Distribution of shares', fontsize=18, y=0.95);
Pie chart distribution of share holders.

5.1) Styling the pie chart

If you wish to highlight the person having the highest contribution, you can split the biggest wedge from the pie chart using the option explode.

You need to specify the value of explode, which determines the distance of the split wedge from the pie chart’s center. Since we have five people, we need to specify a list containing five explode values, e.g. explode = [0, 0, 0.15, 0, 0]. You can also specify non-zero values for all the elements, e.g. explode = [0.1, 0.2, 0.3, 0.4, 0.5]. The following example illustrates this point.

⇒ Here I am manually specifying the value 0.15 as the third element of explode list because the highest percentage is for "Sam" – the third element of the names list.

fig, ax = plt.subplots(figsize=(8,8))
explode = [0, 0, 0.15, 0, 0]
ax.pie(shares, labels=names, explode=explode, autopct='%.2f%%', shadow=True, labeldistance=1.1)
ax.set_title('Distribution of shares', fontsize=18, y=0.95);
Pie chart distribution of share holders highlighting the highest distribution.
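
Instead of locating the largest share by hand, the explode list can also be built programmatically. The following is a minimal sketch in plain Python. The variable name largest and the offset 0.15 are illustrative choices.

```python
names = ['Tom', 'Mark', 'Sam', 'Jim', 'Henry']
shares = [15, 13, 46, 22, 30]

# Index of the largest share (the first one, if there are ties).
largest = shares.index(max(shares))

# Offset only the largest wedge; all other wedges stay in place.
explode = [0.15 if i == largest else 0 for i in range(len(shares))]

print(names[largest], explode)
```

The resulting list can be passed to ax.pie() via explode=explode, exactly like the manually written one.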
