An Unconventional Yet Convenient Matplotlib Broken_Barh Function And When It Is Particularly…

What it is, how to use and customize, when to use

Image by Author

Despite being very convenient for certain cases of data visualization in Python, [broken_barh](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.broken_barh.html) is one of the lesser-known and underrated methods of matplotlib. It plots a horizontal sequence of rectangles for different x-ranges, all at the same vertical position defined by the y-range. In other words, it makes a "broken" horizontal bar plot, i.e., one with gaps.

The syntax is very simple: the function takes two main parameters, xranges and yrange. The xranges parameter is a list or a tuple of 2-item tuples, where the first item of each tuple, xmin, is the leftmost x-position of the corresponding rectangle and the second item, xwidth, is the width of that rectangle (not its right edge). The yrange parameter is a single tuple, (ymin, yheight), giving the lower y-position and the height shared by all the rectangles.

A basic plot looks like this:

import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(8,2))
plt.broken_barh(xranges=[(1, 1), (3, 2)], yrange=(1, 1))
sns.despine()
plt.show()
Image by Author
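
To double-check how xranges and yrange are interpreted, we can inspect the rectangles that the call actually created. This is only a small sketch: it assumes the object returned by broken_barh exposes its rectangles through get_paths(), which holds for the polygon collections matplotlib returns here, and the variable names are illustrative.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 2))
bars = ax.broken_barh(xranges=[(1, 1), (3, 2)], yrange=(1, 1))

# Each rectangle is stored as a path; its vertices give the real extents.
extents = []
for path in bars.get_paths():
    xs, ys = path.vertices[:, 0], path.vertices[:, 1]
    extents.append((float(xs.min()), float(xs.max()),
                    float(ys.min()), float(ys.max())))

for x0, x1, y0, y1 in extents:
    print(f"x: {x0} to {x1}, y: {y0} to {y1}")
```

The second rectangle spans x from 3 to 5, which confirms that the second item of each tuple is a width and not an end position.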

Some additional self-explanatory parameters can be used to customize the appearance of the resulting rectangles: color, alpha, edgecolor, linewidth, linestyle (can be 'solid', 'dashed', 'dashdot', or 'dotted'). The hatch parameter is used to fill the rectangles with a pattern and can take one of the following values or their combination as a string: /, \, |, -, +, x, o, O, ., *. The label parameter sets a label that will be shown in the legend.

Several of these parameters, such as color and edgecolor, can be either a single value applying to all rectangles (e.g. color='red') or a list/tuple of values (e.g. color=('red', 'blue')), which is cycled through to create an alternating sequence of rectangles:

plt.figure(figsize=(15,2))
x1 = (1, 1), (2.1, 2), (5, 2), (7.2, 1)
y1 = (1, 1)
plt.broken_barh(x1, y1, 
                color=['slateblue', 'orange'], alpha=0.7, 
                linewidth=4, 
                edgecolor=['black', 'brown', 'green', 'blue'], 
                linestyle='dashed', hatch='/')
sns.despine()
plt.show()
Image by Author

The plot above looks a bit overwhelming, I know 🙂 However, it clearly shows how the additional parameters work.
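
The hatch and label parameters are given once per call, so one way to compare several patterns is to make a separate broken_barh call for each of them. The sketch below does that; the function name plot_hatched_bars, the chosen patterns, and the layout are illustrative assumptions rather than part of the examples above.

```python
import matplotlib.pyplot as plt


def plot_hatched_bars(hatches=("/", "\\", "x")):
    """Draw one row of broken bars per hatch pattern and return the Axes."""
    fig, ax = plt.subplots(figsize=(8, 3))
    for row, hatch in enumerate(hatches):
        ax.broken_barh(
            [(1, 1), (3, 2)],       # same x-ranges in every row
            (row + 0.75, 0.5),      # each pattern gets its own row
            facecolor="white",
            edgecolor="black",      # the hatch is drawn in the edge color
            hatch=hatch,
            label=f"hatch {hatch}",
        )
    ax.set_yticks([])
    ax.legend(loc="upper left", bbox_to_anchor=(1, 1))
    return ax


ax = plot_hatched_bars()
```

Calling plt.show() afterwards displays the figure; each call contributes one legend entry.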

When can the broken_barh function be of use in the real world?

1. To display ranges

With this method, we can show the ranges of different categories on the same plot, for example:

  • boundary conditions (say, temperature) for different species of animals,
  • rainfall amount during a time period for different geographical locations,
  • salary ranges for various jobs.

Let’s take a look at the last case using dummy data:

x_ranges = [[(3800, 600)], [(2000, 1000)], [(2700, 800)]]
y_start = 1
colors = ['deepskyblue', 'limegreen', 'magenta']
for i in range(len(x_ranges)):
    plt.broken_barh(x_ranges[i], (y_start+i, 0.5), color=colors[i])    
plt.title('Salary ranges by job ($)', fontsize=25)
plt.xticks(fontsize=14)
plt.yticks(ticks=[1.25, 2.25, 3.25], 
           labels=['Job1', 'Job2', 'Job3'], fontsize=16)
plt.tick_params(left=False)
sns.despine()
plt.show()
Image by Author

In this case, the broken_barh method practically drew a single rectangle for each category.

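Attention: even when we draw only one rectangle for a category, xranges still has to be a sequence of 2-item tuples, so we put the single tuple in a list, e.g. [(3800, 600)]. Writing ((3800, 600)) does not help, because without a trailing comma the outer parentheses are ignored and we are left with the bare tuple.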
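
When the ranges come as (min, max) pairs rather than (xmin, xwidth), both the conversion and the tick positions can be computed instead of typed by hand. A minimal sketch, assuming a hypothetical salaries dictionary that mirrors the dummy ranges above and a hypothetical helper plot_salary_ranges:

```python
import matplotlib.pyplot as plt

# Hypothetical (min, max) salaries matching the dummy ranges used above.
salaries = {"Job1": (3800, 4400), "Job2": (2000, 3000), "Job3": (2700, 3500)}


def plot_salary_ranges(ranges, y_start=1, height=0.5):
    """Draw one bar per category and return the Axes."""
    fig, ax = plt.subplots(figsize=(8, 4))
    tick_positions = []
    for i, (low, high) in enumerate(ranges.values()):
        # xranges must be a sequence of (xmin, xwidth) tuples, so the single
        # rectangle is wrapped in a list and the maximum becomes a width.
        ax.broken_barh([(low, high - low)], (y_start + i, height))
        # Put the tick in the vertical middle of the bar.
        tick_positions.append(y_start + i + height / 2)
    ax.set_yticks(tick_positions)
    ax.set_yticklabels(list(ranges))
    ax.set_title("Salary ranges by job ($)")
    return ax


ax = plot_salary_ranges(salaries)
```

With this layout the y-ticks land at 1.25, 2.25, and 3.25, the same positions that were hard-coded in the example above.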

2. To emphasize some intervals on another graph

The broken_barh method is a good choice when it comes to emphasizing some particular intervals on other graphs. For instance:

  • we have a line plot of temperature fluctuations in a geographical zone during a time period and we would like to show the intervals where the temperature was higher than a certain value,
  • currency rate changes over some value of interest,
  • traffic volume fluctuations when the traffic is more intense than average.

The graph below represents exactly the last case: hourly traffic volume fluctuations on the I-94 Interstate highway, from 06.00 till 19.00 on weekdays (the data was taken from this repository, reworked, and slightly modified). We applied the broken_barh method to emphasize the time ranges when the traffic was more intense than average:

import pandas as pd
cars_per_hr = pd.Series([4141, 4740, 4587, 4485, 4185, 4466, 4718, 4801, 4932, 5241, 5664, 5310, 4264, 3276])
# Create a line plot
plt.figure(figsize=(10,5))
plt.plot(cars_per_hr, color="grey")
# Create a broken horizontal bar plot
x = [(0.8, 0.95), (5.65, 6)]
y = (cars_per_hr.mean()-50, 100)
plt.broken_barh(x, y, color='red')
plt.title('Traffic volume from 06.00 till 19.00 on weekdays', 
          fontsize=25)
plt.xlabel('Time', fontsize=20)
plt.ylabel('Traffic volume, cars/hr', fontsize=20)
plt.xticks(ticks=list(range(14)), 
           labels=list(range(6,20)), fontsize=13)
plt.yticks(fontsize=13)
plt.xlim(0,13)
sns.despine()  
plt.show()
Image by Author

We see that there are two time ranges when the traffic was more intense than average: approximately from 6.45 till 7.40 and from 11.40 till 17.40.
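
The two intervals above can also be derived from the data instead of being read off the graph. The sketch below assumes that the line plot connects consecutive hourly values with straight segments, so each crossing of the average can be found by linear interpolation; intervals_above is a hypothetical helper, not a matplotlib function.

```python
from statistics import mean


def intervals_above(values, threshold):
    """Return (xmin, xwidth) tuples for the stretches where the
    piecewise-linear curve through `values` lies above `threshold`."""
    xranges = []
    start = 0.0 if values[0] > threshold else None
    for i in range(len(values) - 1):
        left, right = values[i], values[i + 1]
        if (left > threshold) == (right > threshold):
            continue  # both points on the same side: no crossing here
        # x-position where the segment from i to i + 1 meets the threshold
        crossing = i + (threshold - left) / (right - left)
        if right > threshold:
            start = crossing                            # curve goes above
        else:
            xranges.append((start, crossing - start))   # curve drops below
            start = None
    if start is not None:
        # The curve is still above the threshold at the last point.
        xranges.append((start, len(values) - 1 - start))
    return xranges


cars_per_hr = [4141, 4740, 4587, 4485, 4185, 4466, 4718,
               4801, 4932, 5241, 5664, 5310, 4264, 3276]
xranges = intervals_above(cars_per_hr, mean(cars_per_hr))
print([(round(x, 2), round(w, 2)) for x, w in xranges])
```

The computed values come out close to the hand-picked (0.8, 0.95) and (5.65, 6), and the resulting list can be passed to broken_barh as xranges.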

3. To create a Gantt plot

Probably, the best application of the broken_barh method is to create simplified Gantt plots. A Gantt plot is a specific type of bar plot commonly used for visualizing relationships between different activities against time. These activities can include:

  • project management schedule and its current status,
  • tasks to accomplish and their deadlines,
  • ongoing events,
  • vacations.

There are other, more sophisticated ways of creating Gantt plots in Python, such as using the Plotly or python-gantt libraries. However, the broken_barh matplotlib function is also good enough for creating a basic yet informative graph. In addition, it takes only a few lines of code.

Let’s illustrate how this method works with the example of a vacation schedule for this summer. We’ll use dummy data for a department of 5 people:

plt.figure(figsize=(20,7))
# Create broken horizontal bar plots for vacation ranges 
x_ranges = [[(0,3), (36,3), (69,12)], [(13,19), (55,5)], [(48,5), (76,12)], [(27,19)], [(0,4), (62,12), (86,2)]]
y_start = 0.75
colors = ['deepskyblue', 'indianred', 'limegreen', 'gold', 'violet']
for i in range(len(x_ranges)):
    plt.broken_barh(x_ranges[i], (y_start+i, 0.5), color=colors[i])
# Create broken horizontal bar plots for weekends 
x_weekend = [(4,2), (11,2), (18,2), (25,2), (32,2), (39,2), (46,2), (53,2), (60,2), (67,2), (74,2), (81,2), (88,2)]
y_weekend = (0, 5.5)
plt.broken_barh(x_weekend, y_weekend, color='tomato', alpha=0.1)
plt.title('Vacation schedule for June-August 2021', fontsize=35)
plt.xticks(ticks= [0, 4, 9, 14, 19, 24, 30, 34, 39, 44, 49, 54, 61, 65, 70, 75, 80, 85, 90],
           labels=['June', 5, 10, 15, 20, 25, 'July', 5, 10, 15, 20, 25, 'August', 5, 10, 15, 20, 25, 30], fontsize=20)
plt.yticks(ticks= [1, 2, 3, 4, 5], 
           labels=['Vladimir', 'Natalia', 'Michael', 'Peter', 'Elena'], fontsize=20)
plt.xlim(0, 92)
plt.ylim(0, 5.5)
plt.axvline(30, color='black')
plt.axvline(61, color='black')
plt.grid(axis='x')
plt.tick_params(left=False, bottom=False)
plt.show()
Image by Author

The plot above looks quite clear even without using any specialized library: we see the employees’ names, the number of vacations and vacation ranges for each employee, the main time landmarks (the beginning and the 5th, 10th, etc. of each month, all the weekends), the periods of overlapped vacations or those when nobody took a vacation.
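
The day offsets in this example were entered by hand, but they can also be derived from calendar dates with the standard datetime module. The following is a sketch under a few assumptions: day 0 is June 1, 2021, each day occupies one unit on the x-axis, a vacation includes both its first and last day, and the helper names and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def to_xrange(first_day, last_day, origin):
    """(xmin, xwidth) for a period from first_day to last_day inclusive."""
    return ((first_day - origin).days, (last_day - first_day).days + 1)


def weekend_xranges(origin, end):
    """One 2-day (xmin, xwidth) tuple per Saturday between origin and end."""
    xranges = []
    day = origin
    while day <= end:
        if day.weekday() == 5:  # Monday is 0, so 5 is Saturday
            xranges.append(((day - origin).days, 2))
        day += timedelta(days=1)
    return xranges


origin = date(2021, 6, 1)
# A vacation from July 7 to July 9 starts at offset 36 and lasts 3 days.
print(to_xrange(date(2021, 7, 7), date(2021, 7, 9), origin))
# The first weekends of the summer, as offsets from June 1.
print(weekend_xranges(origin, date(2021, 8, 31))[:3])
# Offsets of July 1 and August 1, usable for the vertical month lines.
print((date(2021, 7, 1) - origin).days, (date(2021, 8, 1) - origin).days)
```

For June to August 2021 this reproduces the weekend offsets and the month boundaries (30 and 61) used in the plot above.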

Conclusion

In this article, we explored the syntax of a rarely used yet convenient broken_barh matplotlib method, the ways of customization of the resulting plots, and especially the situations when this method can be of particular use, namely for displaying the ranges of different categories on the same plot (boundary conditions, salary ranges), emphasizing some intervals on another graph (currency rate changes, traffic volume fluctuations), and creating Gantt plots (visualizing schedules such as project steps, tasks, or vacations).

Thanks for reading!

