
Introduction
Abraham Maslow writes, "I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail".
This is the situation that aspiring data scientists find themselves in when analyzing time series data. The seasonal_decompose function from Python's Statsmodels library is the hammer, and every time series is just another nail.
Decomposing our time series is an important step in improving forecast accuracy and creating causal insights.
The seasonal_decompose function is an adequate tool for time series decomposition, but there are better approaches available, and as data scientists we should reach for the best tool for the job.
This article has the following goals:
- Explain the importance of time series decomposition.
- Explain the problems with the seasonal_decompose function.
- Introduce alternative approaches to time series decomposition.
Why Should We Decompose Our Time Series Data?
Time series decomposition refers to the method by which we break our time series data down into the following four components:
- Trend [T]
- Cycle [C]
- Seasonality [S]
- Remainder [R]
1) Trend
The trend of a time series refers to the general direction in which the time series is moving. Time series can have a positive or a negative trend, but can also have no trend.
For example, the GDP growth rate for the United States (and many advanced economies) does not have a trend because economic forces keep the growth rate relatively stable.

2) Cycle
The cycle for time series data refers to its tendency to rise and fall at inconsistent frequencies. We often use the cycle component of a time series to discuss business cycles in economic data.
3) Seasonal
The seasonal component of a time series is similar to its cycle component except for one important difference: the seasonal component refers to data that rises and falls at consistent frequencies.
The tourism industry is well-acquainted with the seasonal component. A country's tourism industry experiences high revenues in the warmer months, and then watches those revenues fall off a cliff at the first sign of snow.
4) Remainder
The remainder is what’s left of the time series data after removing its trend, cycle, and seasonal components. It is the random fluctuation in the time series data that the above components cannot explain.
When forecasting, it is advantageous to use a ‘seasonally-adjusted’ time series, which is just a time series with the seasonal component removed. This allows a forecaster to focus on predicting the general trend of the data.
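To make the idea concrete, here is a minimal sketch of seasonal adjustment, assuming you already have a monthly pandas Series called y with a DatetimeIndex (so the period can be inferred): estimate the seasonal component and subtract it from the raw data.
from statsmodels.tsa.seasonal import seasonal_decompose
# y is assumed to be a monthly pandas Series with a DatetimeIndex
result = seasonal_decompose(y, model='additive')
y_adjusted = y - result.seasonal  # seasonally-adjusted series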
A second reason to use time series decomposition is to identify any interesting behavior in the seasonal component. Then, we can research why our data moves in the way it does.
The Issue With the Usual Approach to Time Series Decomposition
Interestingly, Statsmodels knows that there are better ways to decompose time series data than the usual seasonal_decompose function.
They warn us:
This [seasonal_decompose] is a naive decomposition. More sophisticated methods should be preferred – Statsmodels Documentation
The seasonal_decompose function uses the classical decomposition method, of which there are two types: additive and multiplicative.
Additive Decomposition
Additive decomposition argues that time series data is a function of the sum of its components. Thus,
Y = T + S + R
where Y is the time series data, T is the trend-cycle component, S is the seasonal component, and R is the remainder.
R = Y - T - S

Multiplicative Decomposition
Rather than a sum, the multiplicative decomposition argues that time series data is a function of the product of its components. Thus,
Y = T × S × R
And, rearranging gives us,
R = Y / (T × S)
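As a quick worked example with made-up numbers: suppose an observation is Y = 120. Under the additive model with T = 100 and S = 15, the remainder is R = 120 - 100 - 15 = 5. Under the multiplicative model with T = 100 and S = 1.15, the remainder is R = 120 / (100 × 1.15) ≈ 1.04.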
We can usually identify an additive or multiplicative time series from its variation. If the magnitude of the seasonal component changes with time, then the series is multiplicative. Otherwise, the series is additive.

Notice that the magnitude of the seasonal component – the difference between the maximum point of the series and the red line – is relatively constant from 2011 onward in the additive time series.
However, in the multiplicative series, the magnitude of the seasonal component grows as time increases.
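To see the difference for yourself, here is a small sketch that builds one synthetic additive series and one synthetic multiplicative series (the numbers are arbitrary and are not the series plotted above). In the additive series the seasonal swings stay the same size; in the multiplicative series they grow with the trend.
import numpy as np
import matplotlib.pyplot as plt
# Synthetic monthly data: an upward trend plus a 12-period seasonal pattern
t = np.arange(120)
trend = 100 + 0.5 * t
seasonal = 10 * np.sin(2 * np.pi * t / 12)
additive = trend + seasonal                    # seasonal swings stay the same size
multiplicative = trend * (1 + seasonal / 100)  # seasonal swings grow with the trend
fig, axes = plt.subplots(2, 1, sharex=True)
axes[0].plot(additive)
axes[0].set_title('Additive: constant seasonal magnitude')
axes[1].plot(multiplicative)
axes[1].set_title('Multiplicative: seasonal magnitude grows with the trend')
plt.show()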
Note: identifying whether a series is additive or multiplicative is trickier than the above image might suggest. Oftentimes, one component of the time series might be additive while the others are multiplicative.
For example, you can reasonably have a time series where

And, thus,

The classical approach to time series decomposition has several issues:
- It uses a two-sided moving average to estimate the trend-cycle, so the trend-cycle estimate is missing for the first few and the last few observations (see the sketch after this list).
- It assumes that the seasonal component is constant throughout the entire series. This may be a reasonable assumption over short periods, but it becomes untenable over longer ones. For example, innovations in air travel and other modes of transportation have fundamentally changed the tourism industry in many economies, so it would be incorrect to assume that its seasonal variation has remained stable throughout its history.
- It over-smooths the trend line, so the trend is not responsive to sharp fluctuations, which leaves a large remainder component.
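The first point is easy to verify with a minimal sketch, again assuming a monthly pandas Series y: the trend returned by seasonal_decompose starts and ends with missing values, and the estimated seasonal pattern simply repeats.
from statsmodels.tsa.seasonal import seasonal_decompose
# y is assumed to be a monthly pandas Series with a DatetimeIndex
result = seasonal_decompose(y)
print(result.trend.head(6))      # first values are NaN (two-sided moving average)
print(result.trend.tail(6))      # last values are NaN as well
print(result.seasonal.head(24))  # the same 12-month pattern, repeated verbatim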
Thankfully, there are approaches to time series decomposition that resolve the above issues.
Alternatives to the Classical Approach
X11 Decomposition
X11 decomposition produces a trend-cycle estimate for all observations, and it allows the seasonal component to change slowly over time.
I'm unaware of a Python library that implements the X-11 procedure directly. However, it can be done relatively easily with the seas function in R's seasonal package, and you may be able to use the rpy2 library to replicate the R code in Python (a rough sketch follows below).
seas(data, x11 = "")
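If you would rather stay in Python, something along the following lines may work with rpy2. This is an untested sketch and assumes that R, the seasonal package, and its X-13ARIMA-SEATS binary are installed.
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
seasonal = importr('seasonal')    # R's seasonal package
air = ro.r('AirPassengers')       # built-in R data set used later in this article
fit = seasonal.seas(air, x11="")  # X11 decomposition via X-13ARIMA-SEATS
print(ro.r['summary'](fit))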
Let’s compare the results of the X11 decomposition with those of a classical decomposition.
In R:
library(forecast)
library(ggplot2)
autoplot(AirPassengers)

Because the seasonal component increases with time, we know that we should use multiplicative decomposition.
mfit <- decompose(x = AirPassengers, type = "multiplicative")
autoplot(mfit)

Notice that the seasonal component is unchanging, the remainder component has a lot of large values, and the trend line is missing some observations from the beginning and from the end of our data set.
Now, if we use the X11 decomposition,
fit <- seas(x = AirPassengers, x11 = "")
autoplot(fit)

You should notice three things:
- The seasonal component increases with time, thus reflecting the fundamental changes in the airline industry since 1950.
- The remainder component is smaller than it was with the classical decomposition, because the trend line fits the raw data more closely.
- There are no missing observations in the trend line.
STL Decomposition
One problem with X11 is that it only handles monthly and quarterly data.
Another is that it is not robust to outliers.
Therefore, in 1990, researchers at the University of Michigan and Bell Labs published "STL: Seasonal-Trend Decomposition Procedure Based on Loess".
The STL approach to time series decomposition has the following advantages over the X11 approach:
- It handles any type of seasonality.
- The user can control the rate of change of the seasonal component.
- It is robust to outliers.
We can implement STL in Python with the STL function.
from statsmodels.tsa.seasonal import STL
We can change the smoothness of the trend-cycle and the seasonal components by passing an integer into the trend and seasonal arguments in the STL function. The seasonal argument is set to 7 by default (it is also recommended that you use a seasonal smoother greater than or equal to 7).
If a trend value is not specified, then Statsmodels calculates a trend value by using the smallest odd integer greater than
1.5 * period / (1 - 1.5 / seasonal)
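As a quick back-of-the-envelope check of that formula, monthly data (period of 12) with the default seasonal smoother of 7 gives a default trend smoother of 23:
import math
period = 12   # monthly data
seasonal = 7  # statsmodels' default seasonal smoother
# Smallest odd integer strictly greater than 1.5 * period / (1 - 1.5 / seasonal)
trend = math.floor(1.5 * period / (1 - 1.5 / seasonal)) + 1
if trend % 2 == 0:
    trend += 1
print(trend)  # 23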
The choice of the seasonal smoother is up to you. The larger the integer, the smoother your seasonal component becomes, and the less of your data's variation is attributed to the seasonal component. You must therefore decide how much of the variation in your data can reasonably be attributed to seasonality.
The originators of the STL method suggest using a seasonal diagnostic plot, and then experimenting with different smoother values to determine which value seems right. Unfortunately, there is no implementation of this in Python (there is in R, however).
Implementing STL in Python:
from statsmodels.tsa.seasonal import STL
import matplotlib.pyplot as plt
import pandas as pd
# Load the airline passengers data (the file path is an assumption -- point it
# at your copy of the classic AirPassengers CSV)
df = pd.read_csv('AirPassengers.csv')
df = df[:len(df) - 1]  # Removes the last row in the data set
df.columns = ['Month', 'Passengers']
df.head()

df['Month'] = pd.to_datetime(df['Month'])  # Parse the month strings as dates
df = df.set_index('Month')  # Set the index to a datetime index
df = df.asfreq('MS')  # Set the frequency to month-start
# Set robust to True to handle outliers
res = STL(df['Passengers'], robust=True).fit()
res.plot()
plt.show()

The seasonal component varies more rapidly over time here than it did with the X11 decomposition. And because STL is robust to outliers, its estimate of the seasonal component is likely more accurate than the one we got from X11.
Feel free to experiment with the seasonal argument in the STL function. Just make sure that you use an odd integer.
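For instance, to see the effect of a larger smoother, you could fit the same data again and compare the seasonal components. This is a small sketch that builds on the res and df objects above; the choice of 23 is arbitrary.
import matplotlib.pyplot as plt
# Compare the default seasonal smoother with a larger one
res_smooth = STL(df['Passengers'], seasonal=23, robust=True).fit()
fig, axes = plt.subplots(2, 1, sharex=True)
axes[0].plot(res.seasonal)
axes[0].set_title('seasonal=7 (default)')
axes[1].plot(res_smooth.seasonal)
axes[1].set_title('seasonal=23')
plt.show()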
If you’d like to seasonally adjust your data, then subtract the seasonal component from the raw data.
That is,
df['Seasonally Adjusted'] = df['Passengers'] - res.seasonal
df.head()
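One caveat worth noting: STL is an additive decomposition, and we argued earlier that AirPassengers is multiplicative. A common workaround, sketched below and not part of the code above, is to decompose the logged series and adjust on the log scale before transforming back.
import numpy as np
# Decompose the logged series, then transform back to the original scale
log_res = STL(np.log(df['Passengers']), robust=True).fit()
df['Seasonally Adjusted (log)'] = np.exp(np.log(df['Passengers']) - log_res.seasonal)
df.head()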

Conclusion
Now you know some alternative approaches to seasonal decomposition. These approaches will give you better estimates of the seasonal and trend-cycle components of your time series than you would get from Statsmodels' seasonal_decompose function.
You will also be able to use these approaches to obtain more accurate forecasts, and to better identify interesting patterns in your data set.
Bibliography
[1] Hyndman, R.J., & Athanasopoulos, G. (2018) Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. OTexts.com/fpp2.
[2] Statsmodels Documentation, statsmodels.tsa.seasonal.seasonal_decompose, https://www.statsmodels.org/stable/generated/statsmodels.tsa.seasonal.seasonal_decompose.html
[3] Sutcliffe, Andrew. (1993) "X11 Time Series Decomposition and Sampling Errors", Australian Bureau of Statistics: Melbourne, Australia.
[4] Cleveland, R.B., Cleveland W.S., McRae J.E., & Terpenning, I. (1990) "STL: Seasonal-Trend Decomposition Procedure Based on Loess", Journal of Official Statistics.
[5] Gardner, Dillon R. (2017) "STL Algorithm Explained: STL Part II".
Notes
- The trend term can be non-linear. Also, there are two types of trends: stochastic and deterministic. I will discuss these in a later article.
- The seasonal_decompose function can also estimate a one-sided moving average; however, this leaves more missing observations at the beginning of the series than the two-sided approach does, and none at the end.