Python Plotting Basics

Simple Charts with Matplotlib, Seaborn, and Plotly

Laura Fedoruk
Towards Data Science

This tutorial will cover the basics of how to use three Python plotting libraries — Matplotlib, Seaborn, and Plotly. After reviewing this tutorial you should be able to use these three libraries to:

  • Plot basic bar charts and pie charts
  • Set up and customize plot characteristics such as titles, axes, and labels
  • Set general graphing styles/characteristics for your plots such as custom font and color choices
  • Understand the differences in use and style between static Matplotlib and interactive Plotly graphics

The data I’m using for these graphics is based on a handful of stories and survey results from the Elephant in the Valley, a survey of 200+ women in tech.

Matplotlib, Seaborn, and Plotly Differences

I’ve heard Matplotlib referred to as the ‘grandfather’ of Python plotting packages. It has nearly everything you’re likely to need to plot your data, and there are lots of examples available on the web of how to use it. Its drawback, I’ve found, is that its default style isn’t always visually appealing, and it can be complex to learn how to make the adjustments you’d like. Sometimes what seems like it should be simple requires quite a few lines of code.

Seaborn is complementary to Matplotlib and, as can be seen from the examples below, it’s built on top of Matplotlib functionality. It has more aesthetically pleasing default style options, and for specific charts, especially those visualizing statistical data, it makes it easy to create compelling graphics that would be complex to build with Matplotlib alone.

Plotly is an online visualization library with a Python API integration. After you’ve set up your account, when you create charts they are automatically linked in your files (and public depending on your account/file settings). It is relatively easy to use and provides interactive graphing capabilities that can be easily embedded into websites. It also has good default style characteristics.

Setting up Our Libraries and Data Frame

I’m using Pandas to organize the data for these plots, and first set up the parameters for my Jupyter Notebook via the following imports. Note that %matplotlib inline simply lets your notebook render each plot in the cell output automatically, and that you only have to set up your Plotly default credentials once.

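import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.font_manager
%matplotlib inline
# "import plotly.plotly as py" binds only the name py, so import the
# top-level package as well for the plotly.tools call below.
import plotly
import plotly.plotly as py
import plotly.graph_objs as go
# Note: in plotly 4 and later, the online plotly.plotly module moved to the
# separate chart_studio package (chart_studio.plotly, chart_studio.tools).
plotly.tools.set_credentials_file(username='***', api_key='***')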

I went ahead and set up a data frame using pandas. The information we’re graphing is as seen below:

Fig.1 — Sample Responses from www.elephantinthevalley.com
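The data frame itself isn’t shown in code here, so the following is only a minimal sketch of how one with the same column names used later in this tutorial ('Question', 'Q Code', and 'Percentage of Respondents') could be built. The rows are placeholders I made up for illustration; they are not the actual survey responses.

```python
import pandas as pd

# Placeholder rows only: the question text, codes, and percentages below are
# invented for illustration and are NOT the survey's actual results.
df = pd.DataFrame({
    "Question": [
        "Placeholder question A",
        "Placeholder question B",
        "Placeholder question C",
    ],
    "Q Code": ["Q1", "Q2", "Q3"],
    "Percentage of Respondents": [10, 20, 30],
})

# One row per survey question, ready to be passed to the plotting calls.
print(df)
```

Swapping in the real questions and percentages is just a matter of replacing the three lists.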

Matplotlib Barchart Example

The following code produces the bar chart seen below using Matplotlib. You can see that we first set up our figure as a subplot with a specified figure size. We then set our default text and color parameters for plotting with Matplotlib using rcParams, the dictionary-like object that holds all default styles/values. Note that when you use rcParams as in the example below, the change is global: you are changing the default style for every Matplotlib plot you create afterwards in the same session.
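If you would rather not change the defaults for the whole session, one option is Matplotlib’s rc_context, which applies style overrides only inside a with block. This is a minimal sketch rather than what the examples in this tutorial do; the Agg backend line is an assumption so the snippet runs without a display, and you would drop it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a Jupyter notebook
import matplotlib.pyplot as plt

# Remember the session-wide default so we can compare afterwards.
default_text_color = plt.rcParams["text.color"]

# Overrides passed to rc_context apply only inside the with block.
with plt.rc_context({"text.color": "#909090", "font.size": 12}):
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.set_title("Styled only inside this block")
    color_inside = plt.rcParams["text.color"]

# Outside the block the original default is back in place.
color_after = plt.rcParams["text.color"]
plt.close(fig)
```

Artists created inside the block keep the styling they were created with, while later figures fall back to the untouched defaults.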

In this example I am using a custom color palette, which is a list of colors, but it would also be possible (and necessary for grouped bar charts) to use a single color value for each set of data you want to plot as bars. Also note that, in addition to hex color codes, you can use the names of colors supported by the library.
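To make the grouped case concrete, here is a small sketch with two sets of bars, each drawn in a single color: one given as a hex code and one as a named color. The category labels, series names, and values are all placeholders for illustration. The named color 'lightblue' happens to be the same color as '#ADD8E6' in the palette above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a Jupyter notebook
import matplotlib.pyplot as plt
import numpy as np

categories = ["A", "B", "C"]  # hypothetical category labels
series_one = [10, 20, 30]     # placeholder values, not survey data
series_two = [15, 25, 35]     # placeholder values, not survey data

ind = np.arange(len(categories))
width = 0.4  # width of each bar within a group

fig, ax = plt.subplots()
# One color per data set: a hex code for the first, a color name for the second.
bars_one = ax.bar(ind - width / 2, series_one, width,
                  color="#009ACD", label="Series one")
bars_two = ax.bar(ind + width / 2, series_two, width,
                  color="lightblue", label="Series two")
ax.set_xticks(ind)
ax.set_xticklabels(categories)
ax.legend()
```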

We set our overall title, axis labels, axis limits, and even rotate our x-axis tick labels using the rotation parameter.

fig, ax = plt.subplots(figsize = (12,6))
plt.rcParams['font.sans-serif'] = 'Arial'
plt.rcParams['font.family'] = 'sans-serif'
plt.rcParams['text.color'] = '#909090'
plt.rcParams['axes.labelcolor']= '#909090'
plt.rcParams['xtick.color'] = '#909090'
plt.rcParams['ytick.color'] = '#909090'
plt.rcParams['font.size']=12
color_palette_list = ['#009ACD', '#ADD8E6', '#63D1F4', '#0EBFE9',
                      '#C1F0F6', '#0099CC']
ind = np.arange(len(df['Question']))
bars1 = ax.bar(ind, df['Percentage of Respondents'],
               color = color_palette_list,
               label='Percentage of yes responses to question')
ax.set_title("Elephant in the Valley Survey Results")
ax.set_ylabel("Percentage of Yes Responses to Question")
ax.set_ylim((0,100))
ax.set_xticks(range(0,len(ind)))
ax.set_xticklabels(list(df['Q Code']), rotation=70)
ax.set_xlabel("Question Key")

This produces the following output:

Fig.3 — Matplotlib Bar Chart Example

Matplotlib Pie Chart Example

The following code produces the pie chart seen below. As in our bar chart example, we first set up our figure as a subplot, then reset our default Matplotlib style parameters via rcParams. In this case we also define our data within the code below rather than taking it from our data frame. We choose to explode the pie chart sections, which is why we set up a variable called explode, and we set the colors to the first two entries of the color palette list defined above. Setting the axes to ‘equal’ ensures that we get a circular pie chart. The autopct parameter formats our values as strings with a set number of decimal places. We also specify the start angle of the pie chart to get the layout we want, and use pctdistance and labeldistance to place our text.
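When autopct is given as a string, Matplotlib applies it to each wedge’s percentage using Python’s %-style formatting, so you can try out a format string on its own before putting it in a chart. A quick sketch using the two percentages from this example:

```python
# The number before "f" is the count of decimal places; "%%" is a literal percent sign.
for fmt in ("%1.0f%%", "%1.1f%%", "%1.2f%%"):
    print(fmt, "->", fmt % 91.0)

# The format used in this tutorial, plus a one-decimal variant for comparison.
formatted = ["%1.0f%%" % 91.0, "%1.1f%%" % 9.0]
```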

After we set the title, we also add a legend to this chart, specify that the legend should not have a frame/visible bounding box, and set the legend location by ‘anchoring’ it with the bbox_to_anchor parameter. Useful tip: if you want your legend to live outside your figure, first set the location parameter to a particular corner such as ‘upper left’, and then use bbox_to_anchor to specify the point that you would like to pin that ‘upper left’ corner of your legend to.
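Here is a minimal sketch of that tip, separate from the main example. The anchor point (1.05, 1.0) is just an illustrative choice in axes coordinates, where (1, 1) is the top right corner of the plot area. Saving to an in-memory buffer with bbox_inches='tight' is my own addition so that a legend sitting outside the axes is not cropped.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a Jupyter notebook
import matplotlib.pyplot as plt

labels = ['Bay Area / Silicon Valley', 'Non Bay Area / Silicon Valley']
percentages = [91, 9]

fig, ax = plt.subplots()
ax.pie(percentages, labels=labels, autopct='%1.0f%%')

# loc picks which corner of the legend box is pinned;
# bbox_to_anchor says where (in axes coordinates) that corner goes.
legend = ax.legend(loc='upper left', bbox_to_anchor=(1.05, 1.0), frameon=False)

# 'tight' expands the saved area to include the legend outside the axes.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
```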

fig, ax = plt.subplots()
plt.rcParams['font.sans-serif'] = 'Arial'
plt.rcParams['font.family'] = 'sans-serif'
plt.rcParams['text.color'] = '#909090'
plt.rcParams['axes.labelcolor']= '#909090'
plt.rcParams['xtick.color'] = '#909090'
plt.rcParams['ytick.color'] = '#909090'
plt.rcParams['font.size']=12
labels = ['Bay Area / Silicon Valley',
          'Non Bay Area / Silicon Valley']
percentages = [91, 9]
explode=(0.1,0)
ax.pie(percentages, explode=explode, labels=labels,
       colors=color_palette_list[0:2], autopct='%1.0f%%',
       shadow=False, startangle=0,
       pctdistance=1.2,labeldistance=1.4)
ax.axis('equal')
ax.set_title("Elephant in the Valley Survey Respondent Make-up")
ax.legend(frameon=False, bbox_to_anchor=(1.5,0.8))

This produces the following output:

Fig. 4 — Matplotlib Pie Chart Example

Seaborn Bar Chart Example

As can be seen from the following code, Seaborn is really just a wrapper around Matplotlib. In this particular example, where we are overriding the default rcParams and using such a simple chart type, it doesn’t make any difference whether you use a Matplotlib or a Seaborn plot. For quick graphics where you’re not changing default styles, or for more complex plot types, I’ve found Seaborn is often a good choice.
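As a rough illustration of the ‘quick graphics’ case, the sketch below leans on one of Seaborn’s built-in styles instead of setting rcParams by hand. The data frame rows are placeholders rather than the survey results, and the choice of the 'whitegrid' style is mine.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a Jupyter notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder rows for illustration only, NOT the actual survey results.
df = pd.DataFrame({
    "Q Code": ["Q1", "Q2", "Q3"],
    "Percentage of Respondents": [10, 20, 30],
})

sns.set_style("whitegrid")  # one of Seaborn's built-in styles
fig, ax = plt.subplots(figsize=(12, 6))

# Column names are passed as strings; Seaborn looks them up in df
# and uses them as the axis labels.
sns.barplot(x="Q Code", y="Percentage of Respondents", data=df,
            color="#009ACD", ax=ax)
ax.set_ylim(0, 100)
```

Because the returned object is a regular Matplotlib axes, all of the ax.set_* calls from the earlier examples still apply.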

fig, ax = plt.subplots(figsize = (12,6))
plt.rcParams['font.sans-serif'] = 'Arial'
plt.rcParams['font.family'] = 'sans-serif'
plt.rcParams['text.color'] = '#909090'
plt.rcParams['axes.labelcolor']= '#909090'
plt.rcParams['xtick.color'] = '#909090'
plt.rcParams['ytick.color'] = '#909090'
plt.rcParams['font.size']=12
ind = np.arange(len(df['Question']))
color_palette_list = ['#009ACD', '#ADD8E6', '#63D1F4', '#0EBFE9',
                      '#C1F0F6', '#0099CC']
# ci=None turns off error bars; seaborn 0.12+ deprecates ci
# in favor of errorbar=None.
sns.barplot(x=df['Q Code'], y = df['Percentage of Respondents'],
            data = df, palette=color_palette_list,
            label="Percentage of yes responses to question",
            ax=ax, ci=None)
ax.set_title("Elephant in the Valley Survey Results")
ax.set_ylabel("Percentage of Yes Responses to Question")
ax.set_ylim(0,100)
ax.set_xlabel("Question Key")
ax.set_xticks(range(0,len(ind)))
ax.set_xticklabels(list(df['Q Code']), rotation=45)

Here the only difference is that we’re calling sns.barplot, and the output can be made to look the same:

Fig. 5 — Seaborn Bar Chart Example

Plotly Bar Chart Example

The following code sets up our bar chart using Plotly. We import our libraries and reuse the same color palette. We then set up our bar chart parameters, followed by overall layout parameters such as the title, and use dictionaries to configure elements such as the axes and fonts. Within these dictionaries we can specify sub-parameters such as x-axis tick label rotation and y-axis range. Finally, we create our figure, feed it our data and layout, and output the file to our Plotly account so that we can embed it as an interactive web graphic.

import plotly.plotly as py
import plotly.graph_objs as go
color_palette_list = ['#009ACD', '#ADD8E6', '#63D1F4', '#0EBFE9',
                      '#C1F0F6', '#0099CC']
trace = go.Bar(
    x=df['Q Code'],
    y=df['Percentage of Respondents'],
    marker=dict(
        color=color_palette_list))
data = [trace]
layout = go.Layout(
    title='Elephant in the Valley Survey Results',
    font=dict(color='#909090'),
    xaxis=dict(
        title='Question Key',
        titlefont=dict(
            family='Arial, sans-serif',
            size=12,
            color='#909090'
        ),
        showticklabels=True,
        tickangle=-45,
        tickfont=dict(
            family='Arial, sans-serif',
            size=12,
            color='#909090'
        ),
    ),
    yaxis=dict(
        range=[0,100],
        title="Percentage of Yes Responses to Question",
        titlefont=dict(
            family='Arial, sans-serif',
            size=12,
            color='#909090'
        ),
        showticklabels=True,
        tickangle=0,
        tickfont=dict(
            family='Arial, sans-serif',
            size=12,
            color='#909090'
        )
    )
)
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='barplot-elephant-in-the-valley')

This produces the following output:

Fig. 6 — Plotly Bar Chart Example

Plotly Pie Chart Example

By now you’ve likely caught on to how we are formatting and calling the parameters within Matplotlib and Plotly to build our visualizations. Let’s take a look at one last chart — an example of how we can create a similar pie chart to the one above using Plotly.

The following code sets up and outputs our chart. We specify our start angle through the rotation parameter, and use the hoverinfo parameter to set what information is shown when we hover over each component of the pie chart.

labels = ['Bay Area / Silicon Valley',
          'Non Bay Area / Silicon Valley']
percentages = [91, 9]
trace = go.Pie(labels=labels,
               hoverinfo='label+percent',
               values=percentages,
               textposition='outside',
               marker=dict(colors=color_palette_list[0:2]),
               rotation=90)
layout = go.Layout(
    title="Elephant in the Valley Survey Respondent Make-up",
    font=dict(family='Arial', size=12, color='#909090'),
    legend=dict(x=0.9, y=0.5)
)
data = [trace]
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='basic_pie_chart_elephant_in_the_valley')

This produces the following output:

Fig. 7 — Plotly Pie Chart Example

And that’s it, we’re all done creating and customizing our bar and pie charts. Hopefully this was helpful to you in learning how to use these libraries in a way that allows you to create bespoke graphical solutions for your data.

As a final note, I’d like to mention that I think it’s important to be cautious about using pie charts. While they are considered a ‘basic’ chart type, they often don’t increase the understanding of underlying data, so use them sparingly and only where you know that they provide value in comprehension.

Happy plotting!


Data scientist, mechanical engineer, and sustainability professional. #data #energy #buildings #environment #empathy. Canadian in Silicon Valley.