
A time series dataset is a collection of data points that are indexed by time and collected over a period. With a time series, you can plot visualizations that show how the values of the subject under study change over time. One particular type of time series plot is the time series boxplot.
A time series boxplot is a useful way to visualize your dataset when you have multiple data points in each time interval. For example, suppose you record the temperature of a location every hour over a period of one month. You can use a boxplot to see how the median temperature changes from day to day, along with the dispersion of the temperatures within each day. This is where the time series boxplot helps.
In this article, I will walk you through some of the basics of plotting a time series boxplot, from setting up a simple dataset using a Pandas Series and DataFrame to loading a real-life dataset, and show you how to plot time series boxplots based on your requirements.
Plotting the Time Series Boxplot using a Pandas Series
The first simple example I want to illustrate is how to plot using a Pandas Series. First, let’s create a DatetimeIndex object containing a range of dates:
import pandas as pd
import numpy as np
date_range = pd.date_range(start = "2022-01-01",
                           end = "2022-02-28 23:59:00",
                           freq = "H")
Here, date_range is a DatetimeIndex object that runs from 2022-01-01 00:00:00 to 2022-02-28 23:00:00. Notice the interval of 1 hour between items (freq='H'):
DatetimeIndex(['2022-01-01 00:00:00', '2022-01-01 01:00:00',
'2022-01-01 02:00:00', '2022-01-01 03:00:00',
'2022-01-01 04:00:00', '2022-01-01 05:00:00',
'2022-01-01 06:00:00', '2022-01-01 07:00:00',
'2022-01-01 08:00:00', '2022-01-01 09:00:00',
...
'2022-02-28 14:00:00', '2022-02-28 15:00:00',
'2022-02-28 16:00:00', '2022-02-28 17:00:00',
'2022-02-28 18:00:00', '2022-02-28 19:00:00',
'2022-02-28 20:00:00', '2022-02-28 21:00:00',
'2022-02-28 22:00:00', '2022-02-28 23:00:00'],
dtype='datetime64[ns]', length=1416, freq='H')
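The length of 1416 follows from the date arithmetic: January (31 days) plus February 2022 (28 days) gives 59 days, and 59 × 24 = 1416 hourly timestamps. Here is a quick check, as a sketch. It passes the frequency as a pd.Timedelta instead of the "H" string, purely so that the snippet does not depend on which frequency aliases your installed pandas version accepts:

```python
import pandas as pd

# Same range as above; freq is given as a one-hour Timedelta
# (a choice made for this sketch, equivalent to hourly steps).
date_range = pd.date_range(start="2022-01-01",
                           end="2022-02-28 23:59:00",
                           freq=pd.Timedelta(hours=1))

n_days = 31 + 28                      # January + February 2022
print(len(date_range), n_days * 24)   # number of timestamps vs. days * 24
print(date_range[0], date_range[-1])  # first and last timestamps
```

Since the end value (23:59) does not fall on a whole hour, the last generated timestamp is the final full hour before it.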
You can now create a Pandas Series using the date_range variable as the index. For the value, let’s use a random number generator:
ts = pd.Series(list(np.random.randn(len(date_range))),
               index = date_range)
Here is the content of ts now:
2022-01-01 00:00:00 -0.869078
2022-01-01 01:00:00 1.742324
2022-01-01 02:00:00 0.937706
2022-01-01 03:00:00 0.366969
2022-01-01 04:00:00 1.841110
...
2022-02-28 19:00:00 0.061070
2022-02-28 20:00:00 0.354997
2022-02-28 21:00:00 1.102489
2022-02-28 22:00:00 -1.299513
2022-02-28 23:00:00 -0.452864
Freq: H, Length: 1416, dtype: float64
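Because the values are random, your numbers will differ from the ones shown above on every run. If you want a reproducible series, one option is a seeded NumPy generator. This is only a sketch; the seed value 42 is an arbitrary choice and not part of the original example:

```python
import numpy as np
import pandas as pd

date_range = pd.date_range(start="2022-01-01",
                           end="2022-02-28 23:59:00",
                           freq=pd.Timedelta(hours=1))  # hourly steps

rng = np.random.default_rng(42)   # arbitrary seed (assumption)
ts = pd.Series(rng.standard_normal(len(date_range)),
               index=date_range)

# Re-creating the generator with the same seed gives identical values
rng_again = np.random.default_rng(42)
ts_again = pd.Series(rng_again.standard_normal(len(date_range)),
                     index=date_range)
print(ts.equals(ts_again))
```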
You are now ready to plot the time series boxplot using matplotlib and Seaborn:
import matplotlib.pyplot as plt
import seaborn
fig, ax = plt.subplots(figsize=(20,5))
seaborn.boxplot(x = ts.index.dayofyear,
                y = ts,
                ax = ax)
You will see the Time Series boxplot below:

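Each box in the plot summarizes the 24 hourly values of one day. If you want to see the numbers behind the boxes, you could compute the quartiles per day with groupby(). The sketch below rebuilds a series like ts (with an arbitrary seed, so the values are not the ones shown earlier) and groups it by day of year, the same key used for the x-axis:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2022-01-01", "2022-02-28 23:59:00",
                    freq=pd.Timedelta(hours=1))
ts = pd.Series(np.random.default_rng(0).standard_normal(len(idx)),
               index=idx)

# Group the hourly values by day of year, as on the boxplot's x-axis
grouped = ts.groupby(ts.index.dayofyear)

stats = pd.DataFrame({
    "q1": grouped.quantile(0.25),      # lower edge of the box
    "median": grouped.median(),        # line inside the box
    "q3": grouped.quantile(0.75),      # upper edge of the box
})
stats["iqr"] = stats["q3"] - stats["q1"]   # height of the box

print(stats.head())
```

The resulting table has one row per day, so it can be read side by side with the chart.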
Plotting the Time Series Boxplot using a Pandas DataFrame
The second example creates a DataFrame with the date_range object set as the index:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn
date_range = pd.date_range(start = "2022-01-01",
                           end = "2022-02-28 23:59:00",
                           freq = "H")
df = pd.DataFrame(
    {
        'temp':np.random.randn(len(date_range))
    }, index = date_range)
df
The dataframe looks as follows:

To see the boxplots for each of the days in the 2 months, use the dayofyear attribute of the DatetimeIndex type (df.index):
fig, ax = plt.subplots(figsize=(20,5))
seaborn.boxplot(x = df.index.dayofyear,
                y = df['temp'],
                ax = ax)
You will see the plot as follows:

Plotting the Time Series Boxplot for Each Month
For the following example, I am going to load the data from a CSV file that you can obtain from the following URL: https://data.gov.sg/dataset/wet-bulb-temperature-hourly.

This data is subject to the Singapore Open Data Licence, available for review at https://data.gov.sg/open-data-licence.
The data contains the hourly wet bulb temperature recorded at the Changi Climate Station.
The Wet Bulb temperature is the temperature of adiabatic saturation. This is the temperature indicated by a moistened thermometer bulb exposed to the air flow. Wet Bulb temperature can be measured by using a thermometer with the bulb wrapped in wet muslin. Source: Temperatures – Dry Bulb/Web Bulb/Dew Point, https://www.weather.gov › zhu › dry_wet_bulb_definition
Let’s load the CSV file into a dataframe:
df = pd.read_csv('wet-bulb-temperature-hourly.csv',
                 parse_dates = ['wbt_date'],
                 index_col='wbt_date')
df
This is what the dataframe looks like:

The data spans 1982 to 2022. Let’s see how the temperature varies for each month. For this, we shall use the month attribute of the DatetimeIndex type:
fig, ax = plt.subplots(figsize=(12,5))
seaborn.boxplot(x = df.index.month,
                y = df['wet_bulb_temperature'],
                ax = ax)
The plot is as follows:

As you can see, May (5) seems to be the hottest month overall. Note that each box pools the readings for that month across all the years in the dataset.
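To back up a visual reading like this with numbers, you could compare the monthly medians directly. The sketch below uses a small synthetic stand-in for the dataset so that it runs without the CSV file. The values and the seasonal shape are made up; only the column name wet_bulb_temperature is taken from the real data:

```python
import numpy as np
import pandas as pd

# Synthetic hourly data for two years (made-up values, not the real dataset)
idx = pd.date_range("1982-01-01", "1983-12-31 23:00:00",
                    freq=pd.Timedelta(hours=1))
rng = np.random.default_rng(1)

# Artificial seasonal cycle that is constructed to peak in May
month = idx.month.to_numpy()
seasonal = np.sin((month - 2) / 12 * 2 * np.pi)

df = pd.DataFrame(
    {"wet_bulb_temperature": 25 + seasonal + rng.normal(0, 0.5, len(idx))},
    index=idx)

# Median per calendar month, pooled over all years (as in the boxplot)
monthly_median = df["wet_bulb_temperature"].groupby(df.index.month).median()
print(monthly_median.round(2))
print("Month with the highest median:", monthly_median.idxmax())
```

With the real dataframe loaded from the CSV file, only the last three lines would be needed.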
Plotting the Time Series Boxplot for Each Year
How about the hottest year? For this, we shall use the year attribute of the DatetimeIndex type:
fig, ax = plt.subplots(figsize=(24,10))
seaborn.boxplot(x = df.index.year,
                y = df['wet_bulb_temperature'],
                ax = ax)
You can see that 1998 is the hottest year:

Observe that the x-tick labels are a bit cramped. Let’s rotate them by 30 degrees:
fig, ax = plt.subplots(figsize=(24,10))
seaborn.boxplot(x = df.index.year,
                y = df['wet_bulb_temperature'],
                ax = ax)
_ = ax.set_xticklabels(ax.get_xticklabels(), rotation = 30)
You can now see that the x-tick labels are easier to read:

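Another way to rotate the tick labels is matplotlib's Axes.tick_params() method, which changes the rotation without passing the labels back in. Here is a minimal sketch on made-up daily data; the column name temp and the values are placeholders, not the wet bulb dataset:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn

# Placeholder data: one random value per day over five years
idx = pd.date_range("2000-01-01", "2004-12-31", freq="D")
df = pd.DataFrame(
    {"temp": np.random.default_rng(2).standard_normal(len(idx))},
    index=idx)

fig, ax = plt.subplots(figsize=(12, 5))
seaborn.boxplot(x=df.index.year, y=df["temp"], ax=ax)

# Rotate the x tick labels by 30 degrees
ax.tick_params(axis="x", labelrotation=30)
```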
Plotting the Time Series Boxplot for Each Day in a Specific Month
If you want to know the temperature for a particular month and year, say January 1982, you can perform a filter on the dataframe first before plotting, like this:
fig, ax = plt.subplots(figsize=(24,10))
seaborn.boxplot(
    x = df['1982-01-01':'1982-01-31'].index.day,
    y = df['1982-01-01':'1982-01-31']['wet_bulb_temperature'],
    ax = ax)
Looks like 1 January 1982 is the hottest day in the month:

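Instead of spelling out the first and last day, pandas also lets you select a whole month from a DatetimeIndex by passing a partial date string to .loc. The sketch below uses synthetic hourly data (the values are made up) and stores the filtered frame once, so that it is not sliced twice:

```python
import numpy as np
import pandas as pd

# Synthetic hourly readings for the first quarter of 1982 (made-up values)
idx = pd.date_range("1982-01-01", "1982-03-31 23:00:00",
                    freq=pd.Timedelta(hours=1))
values = 25 + np.random.default_rng(3).standard_normal(len(idx))
df = pd.DataFrame({"wet_bulb_temperature": values}, index=idx)

# Partial string indexing: every row whose timestamp falls in January 1982
jan = df.loc["1982-01"]

# Compare with the explicit start:end slice over the same month
same = jan.equals(df.loc["1982-01-01":"1982-01-31"])
print(len(jan), same)
```

You could then pass jan.index.day and jan['wet_bulb_temperature'] to seaborn.boxplot() as x and y.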
Plotting the Time Series Boxplot for Each Day of the Year
Finally, if you want to see the temperature readings for a particular year, say 1982, you can use the dayofyear attribute of the DatetimeIndex type:
fig, ax = plt.subplots(figsize=(150,10))
seaborn.boxplot(
    x = df['1982-01-01':'1982-12-31'].index.dayofyear,
    y = df['1982-01-01':'1982-12-31']['wet_bulb_temperature'],
    ax = ax)
fig.savefig('temp.jpg')
Because the chart is pretty big, I have used the savefig() function to save the chart to a file. The chart looks like this:

If you zoom in, you will see that the x-ticks are just running numbers (the day of the year):

You can format the x-ticks using the set_xticklabels() function:
fig, ax = plt.subplots(figsize=(150,10))
seaborn.boxplot(
    x = df['1982-01-01':'1982-12-31'].index.dayofyear,
    y = df['1982-01-01':'1982-12-31']['wet_bulb_temperature'],
    ax = ax)
ax.set_xticklabels(labels =
    df['1982-01-01':'1982-12-31'].index.strftime(
        '%Y-%m-%d').sort_values().unique(),
    rotation=45, ha='right')
fig.savefig('temp.jpg')
The x-ticks would now be the actual dates:

If you want some fancy formatting for the dates, customize them using the strftime() function.
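For instance, a shorter day-and-month label could be produced as in the sketch below. The format string '%d %b' is just one possible choice, and the abbreviated month name it produces depends on your locale:

```python
import pandas as pd

# One timestamp per day of 1982
days = pd.date_range("1982-01-01", "1982-12-31", freq="D")

# One label per day: day of month followed by the abbreviated month name
labels = days.strftime("%d %b")
print(labels[:3].tolist())
print(len(labels))
```

The resulting labels could be passed to set_xticklabels() in place of the '%Y-%m-%d' strings used above.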
Summary
This is a quick overview of how you can create time series boxplots for data that are time-related. Try out the various combinations to create charts that you need!