
Creating presentable plots in Python can be a bit daunting, especially if you are used to making your visualizations in other BI software or even R, where most plots come already prettified for you. Another problem is that there are many ways things can go wrong, and the fix depends on the choices you made for the plot. Here, I will demonstrate a few ways to easily create plots in Python for various scenarios and show you how to resolve some of the issues that may arise in each case.
In this post, I will focus on efficiency and share some of the tidbits that will make creating visually appealing plots fast.
Basic Set-up
Pyplot in Matplotlib is a must-have for plotting in Python. Many other plotting libraries are built on top of Matplotlib; Seaborn is one example. Seaborn adds some nice functionality, but that functionality does create confusion sometimes. Let’s import all our packages first.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
Use %matplotlib inline to display plots if you are using an IPython front-end that can render plots inline, such as Jupyter Notebook.
Add Style
If you just want something presentable up and running quickly, I highly recommend assigning a plot style. Plot styles instantly apply multiple stylistic elements to your plots and save you some trouble. Another reason to assign a style ahead of time is to keep the overall look consistent throughout. If you use different plotting methods (sns, plt, pd) in your document, you could otherwise end up with inconsistent plots.
plt.style.use('plot-style-name-goes-here')
Q. Which style is the best?
I especially love fivethirtyeight for its visibility and simplicity. Audiences from academia are likely to be more familiar with the ggplot style, as ggplot is a popular library in R. See the image below to compare some of the styles.
seaborn is great, but it comes with many options, which might be more than what we want at this point. If you want to look at the full list of available styles, run plt.style.available. Going forward, I will use fivethirtyeight to stylize all my plots.
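As a minimal sketch of the style workflow described above (the toy data and the title are made up for illustration), you can list the available styles and try one out with plt.style.context, which applies the style only inside the with block instead of globally:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt

# plt.style.available is a plain list of style names
available_styles = sorted(plt.style.available)
print(available_styles)

# Apply a style temporarily; plots made outside this block keep the current style
with plt.style.context("fivethirtyeight"):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 3])  # toy data, for illustration only
    ax.set_title("Styled with fivethirtyeight")
```

This is handy for comparing a few styles side by side before committing to one with plt.style.use.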

Q. Do I have to use Matplotlib to plot?
If you are using Pandas, it also comes with some plotting capabilities (but its backend is Matplotlib). It’s handy if you want to just quickly look at the distribution (histogram or density plot) or one-to-one direct relationship between two columns (line or scatter plot).
Note that Pandas plotting does not automatically pick the best plot type. The default is always a line plot.
# Pandas plot example - df is a Pandas dataframe
df.column_name.plot(title = 'Title', kind = 'hist')
# Above is identical to below using Matplotlib.pyplot ...
plt.hist(df.column_name)
plt.title('Title')
plt.ylabel('Frequency')
plt.show()
# You can also plot relationship between variables
df.plot(x = 'column1_name', y = 'column2_name', kind = 'scatter')
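Because Pandas plotting sits on top of Matplotlib, the plot call hands back a Matplotlib Axes object that you can keep customizing. Here is a small sketch, assuming a made-up dataframe with a hypothetical distance column:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.axes
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for df
df = pd.DataFrame({"distance": [10.5, 12.0, 9.8, 30.1, 11.2, 10.9]})

# The Pandas call returns the Matplotlib Axes it drew on
ax = df["distance"].plot(kind="hist", title="Distance")

# From here on, any Matplotlib Axes method works
ax.set_xlabel("Distance")
```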
Now that we have the basic setup, let’s look at different scenarios.
Plotting Frequency/Count Data

Seaborn’s Countplot offers a quick way to display the frequency of each value.
sns.countplot(x = df.column_name)
# to group
sns.countplot(x = 'column1', hue = 'column2', data = df2)
But things can go very wrong sometimes…

We see a number of problems here: our tick labels on the x-axis are overlapping, and the legend box is in a less-than-ideal location. Let’s see how to resolve these issues.
Q. How do I rotate the labels?
We can override the settings for the x-ticks using Matplotlib. rotation sets the angle, in degrees, by which to rotate the text, and ha (horizontal alignment) shifts the labels so that their right ends line up with the ticks.
sns.countplot(x = 'method', hue = 'number', data = df2)
plt.xticks(rotation = 45, ha = 'right')
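plt.xticks acts on whichever Axes is currently active. If you have several Axes in one figure, one alternative is to rotate the labels on a specific Axes with plt.setp. The sketch below uses a tiny made-up dataframe in place of df2:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for df2
df2 = pd.DataFrame({
    "method": ["Radial Velocity", "Transit", "Transit",
               "Imaging", "Eclipse Timing Variations"],
})

# countplot returns the Axes it drew on
ax = sns.countplot(x="method", data=df2)

# Set rotation and alignment on this Axes' tick labels only
plt.setp(ax.get_xticklabels(), rotation=45, ha="right")

rotations = [label.get_rotation() for label in ax.get_xticklabels()]
alignments = [label.get_ha() for label in ax.get_xticklabels()]
```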

Q. How do I move the legend outside?
To move the legend, we need to assign it a location. We can override the legend settings using Matplotlib. bbox_to_anchor lets you set the location of the legend manually. If you just want to put it in the upper right corner of the plot, you can also add the location info loc = 'upper right'.
sns.countplot(x = 'method', hue = 'number', data = df2)
plt.xticks(rotation = 45, ha = 'right')
plt.legend(title = 'Number', bbox_to_anchor = (1, 1))
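It helps to know how the two legend arguments interact: bbox_to_anchor is a point in axes coordinates, where (0, 0) is the lower left and (1, 1) the upper right of the plot area, and loc names which corner of the legend box is pinned to that point. A sketch with made-up data (the loc = 'upper left' choice here is mine, not from the plot above):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for df2
df2 = pd.DataFrame({
    "method": ["Transit", "Transit", "Radial Velocity", "Radial Velocity", "Imaging"],
    "number": [1, 2, 1, 1, 2],
})

ax = sns.countplot(x="method", hue="number", data=df2)

# Pin the legend's upper-left corner to the plot's upper-right corner,
# so the legend box sits just outside the plot area
legend = ax.legend(title="Number", bbox_to_anchor=(1, 1), loc="upper left")
```

With loc = 'upper right' and the same anchor, the legend would instead sit inside the plot, tucked into its upper right corner.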

Q. How can I stack the bars?
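The plot looks better, but it’s a bit hard to read. It would be clearer if the bars were stacked per method. countplot has a parameter called dodge, which places the bars for each hue level side by side by default. If we set it to False, the bars are drawn at the same x position instead. Note that they overlap rather than add up, so this gives a stacked look, not a cumulative stack.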
sns.countplot(x = 'method', hue = 'number', data = df2,
dodge = False)
plt.xticks(rotation = 45, ha = 'right')
plt.legend(title = 'Number', bbox_to_anchor = (1, 1))
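If you need the segment heights to add up to the total count per method, one option outside Seaborn is to count the combinations with pd.crosstab and let Pandas draw a stacked bar chart. This is a different technique from countplot, sketched here with made-up data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for df2
df2 = pd.DataFrame({
    "method": ["Transit", "Transit", "Transit",
               "Radial Velocity", "Radial Velocity", "Imaging"],
    "number": [1, 2, 2, 1, 1, 3],
})

# Rows are methods, columns are values of 'number', cells are counts
counts = pd.crosstab(df2["method"], df2["number"])

# stacked=True piles the segments on top of each other per method
ax = counts.plot(kind="bar", stacked=True)
plt.xticks(rotation=45, ha="right")
ax.legend(title="Number", bbox_to_anchor=(1, 1), loc="upper left")
```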

Q. Order seems random? How can I change the order?
Our plot looks much better, but the overall order seems very random. We can set the order of the bars manually using countplot’s order parameter. This parameter also works as a filter: categories left out of the list are not plotted. (While we are at it, let’s remove the x-label and use it as the title instead.)
sns.countplot(x = 'method', hue = 'number', data = df2,
dodge = False,
order = ['Radial Velocity', 'Transit', 'Imaging',
'Microlensing', 'Eclipse Timing Variations'])
plt.xticks(rotation = 45, ha = 'right')
plt.legend(title = 'Number', bbox_to_anchor = (1, 1))
plt.xlabel('')
plt.title('Method')
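Typing the order by hand works for a handful of categories. If you would rather sort the bars by frequency, one approach is to build the list from value_counts, which returns counts in descending order. A sketch with made-up data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for df2
df2 = pd.DataFrame({
    "method": ["Transit"] * 3 + ["Radial Velocity"] * 2 + ["Imaging"],
})

# Most frequent category first
order = df2["method"].value_counts().index.tolist()

ax = sns.countplot(x="method", data=df2, order=order)
plt.xticks(rotation=45, ha="right")
```

Slicing the list, for example order[:2], would keep only the two most frequent categories in the plot.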

Great! Now our frequency plot looks much better.
Plotting Categorical x Quantitative
You can easily try many different options to plot values across categories using Seaborn’s catplot. By default, catplot draws a strip plot, but you can change this by setting the kind parameter to a different plot type, such as box or violin. Just to confuse everyone a bit more, you can also draw these categorical plots by calling them directly (e.g. sns.boxplot or sns.violinplot), and the available parameters will be different. Let’s try to fix a messy catplot.
# first using catplot
sns.catplot(x = 'year', y = 'distance', hue = 'method',
data = df, kind = 'box')

Oh, no! This time it did put the legend outside, but the x-ticks are again overlapping. The lines also seem to be too thick for the boxplot, and the outlier markers are very big. Lastly, the plot is a bit too narrow. We know how to fix the x-ticks, now let’s fix the other issues.
Q. Lines around the boxplot look strange, they are too thick.
To optimize the linewidth, we can manually set the linewidth of the plot.
sns.catplot(x = 'year', y = 'distance', hue = 'method',
data = df, kind = 'box', linewidth = 0.5)
plt.xticks(rotation = 45, ha = 'right')

Q. Outlier markers on the box-plot are too big.
Now the outliers seem to be way out of proportion compared to our nice new lines. Let’s make them smaller too, using fliersize. If you want to remove them altogether, you can use showfliers = False instead.
sns.catplot(x = 'year', y = 'distance', hue = 'method',
data = df, kind = 'box', linewidth = 0.5,
fliersize = 1)
plt.xticks(rotation = 45, ha = 'right')

Q. My plot is too narrow. How can I change the proportion of the plot?
Lastly, the overall plot is looking too narrow. So let’s try to widen the plot area by changing the aspect ratio. The aspect value changes the width holding the height constant.
sns.catplot(x = 'year', y = 'distance', hue = 'method',
data = df, kind = 'box', linewidth = 0.5,
fliersize = 1, aspect = 1.5)
plt.xticks(rotation = 45, ha = 'right')
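To make the relationship concrete: catplot sizes each facet from two parameters, height (in inches) and aspect, and the facet width is height times aspect. The sketch below uses a tiny made-up dataframe and leaves out hue, since an outside legend adds extra width to the figure:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for df
df = pd.DataFrame({
    "year": [2008, 2008, 2008, 2009, 2009, 2009],
    "distance": [10.0, 12.5, 50.0, 8.0, 9.5, 11.0],
})

# height = 4 inches and aspect = 1.5 give a facet 6 inches wide
g = sns.catplot(x="year", y="distance", data=df, kind="box",
                height=4, aspect=1.5)

width, height = g.figure.get_size_inches()
```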

If you use sns.boxplot directly, it won’t have the aspect parameter, and we will need to change the aspect ratio by setting the figure size using Matplotlib: fig = plt.figure(figsize = (w, h)).
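Here is a sketch of the sns.boxplot route, again with made-up data. I use plt.subplots rather than plt.figure so that the Axes can be passed to Seaborn explicitly through the ax parameter; the 9 by 6 inch size is an arbitrary choice:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for df
df = pd.DataFrame({
    "year": [2008, 2008, 2008, 2009, 2009, 2009],
    "distance": [10.0, 12.5, 50.0, 8.0, 9.5, 11.0],
})

# figsize is (width, height) in inches
fig, ax = plt.subplots(figsize=(9, 6))

# Draw onto the Axes we just created
sns.boxplot(x="year", y="distance", data=df,
            linewidth=0.5, fliersize=1, ax=ax)
plt.xticks(rotation=45, ha="right")

figure_size = tuple(fig.get_size_inches())
```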
Q. My plots are getting cut-off when saved locally.
Lastly, sometimes plots may have their title or legend cropped when saved locally. To prevent this issue, call plt.tight_layout() before saving the plot.
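Another option that can help, in addition to tight_layout, is passing bbox_inches = 'tight' to savefig, which sizes the saved image to fit everything that was drawn. The sketch below saves to an in-memory buffer so that it leaves no file behind; in practice you would pass a file name such as 'plot.png' (a placeholder) instead:

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; drop it in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 4, 3], label="toy series")  # toy data, for illustration only
ax.set_title("Method")
ax.legend(title="Number", bbox_to_anchor=(1, 1), loc="upper left")

# Adjust the padding so titles and labels fit inside the figure
fig.tight_layout()

# bbox_inches='tight' makes the saved image fit all drawn elements
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
png_bytes = buffer.getvalue()
```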
Today, we briefly looked at a few tips for setting up a plot in Python using Matplotlib and Seaborn, and how to solve a few common issues when using Seaborn. If you keep running into any other issues, please leave a comment and I can add them to the post!
Happy New Year!
