Time Series Forecasting with Holt Winters’

A discussion and implementation of the most powerful and useful exponential smoothing model

Photo by Denis Degioanni on Unsplash

Background

In my recent posts, we have been discussing a very well known family of forecasting models, exponential smoothing. The fundamental principle of exponential smoothing is to put more weight on recent observations and less on historical observations as a means to forecast the time series.

The most basic exponential smoothing model is (funnily) simple exponential smoothing, also known as single exponential smoothing. This model just forecasts the level of the time series and doesn’t take into account trend or seasonality. To learn more about this model, check out my previous post:

Forecasting with Simple Exponential Smoothing

The next step from this simple model is Holt’s linear trend method, which is also known as double exponential smoothing. As its name suggests, this model incorporates the trend as well as the level. If you want to learn more about Holt’s method, refer here:

Forecasting with Holt’s Linear Trend Exponential Smoothing

Finally, the next step from Holt’s method is to find a way to include seasonality in the exponential smoothing model. This is where Holt Winters (triple exponential smoothing) comes in!

In this post we will recap the theory of exponential smoothing, dive into the mathematics of how Holt Winters’ model includes seasonality, and lastly go through a real-life example in Python.

Holt Winters’ Model Theory

Simple Exponential Smoothing Recap

Let’s quickly go over how simple exponential smoothing works:

For a full explanation on simple exponential smoothing refer to my previous article on the subject here.

Equation generated by author in LaTeX.

Where ŷ_{t+1} is the value we are forecasting, y_t is the most recent observed value, ŷ_t is our previous forecast (the forecast we made for time t) and α is the smoothing factor (0 ≤ α ≤ 1).
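
Written out in its standard textbook form, the recursion is shown below. Here ŷ_t denotes the previous forecast, that is, the forecast of y_t made one step earlier. The second line unrolls the recursion (assuming a long history) to show why recent observations carry more weight: the weights decay geometrically with age.

```latex
\begin{aligned}
\hat{y}_{t+1} &= \alpha\, y_t + (1 - \alpha)\,\hat{y}_{t} \\
              &= \alpha\, y_t + \alpha(1 - \alpha)\, y_{t-1} + \alpha(1 - \alpha)^2\, y_{t-2} + \cdots
\end{aligned}
```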

This model in component form is:

Equation generated by author in LaTeX.

Here h is the number of time steps ahead we are forecasting, and we write l_t = ŷ_{t+1} to state explicitly that this is the level component of the model.
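
For reference, the component form in standard textbook notation is a forecast equation plus a level equation. Because the model has no trend or seasonality, the forecast is flat: every horizon h gets the last estimated level.

```latex
\begin{aligned}
\text{Forecast:} \quad & \hat{y}_{t+h} = l_t \\
\text{Level:}    \quad & l_t = \alpha\, y_t + (1 - \alpha)\, l_{t-1}
\end{aligned}
```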

Holt’s Linear Trend Method Recap

Holt’s linear trend model incorporates a trend component into the forecast:

Equation generated by author in LaTeX.

Here b_t is the forecasted trend, b_{t-1} is the previous forecasted trend and β is the trend smoothing factor (0 ≤ β ≤ 1).
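
In standard textbook notation, Holt’s method adds a trend equation to the level equation, and the forecast extrapolates the latest trend linearly over h steps:

```latex
\begin{aligned}
\text{Forecast:} \quad & \hat{y}_{t+h} = l_t + h\, b_t \\
\text{Level:}    \quad & l_t = \alpha\, y_t + (1 - \alpha)\,(l_{t-1} + b_{t-1}) \\
\text{Trend:}    \quad & b_t = \beta\,(l_t - l_{t-1}) + (1 - \beta)\, b_{t-1}
\end{aligned}
```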

For a full explanation of Holt’s linear trend method, refer to my previous article on it here.

Holt Winters’

As stated above, the Holt Winters’ model further extends Holt’s linear trend method by adding seasonality to the forecast. The addition of seasonality gives rise to two different Holt Winters’ models, additive and multiplicative.

The difference between the two models is the size of the seasonal fluctuations. For an additive model the seasonal fluctuations are roughly constant in size. However, for a multiplicative model the fluctuations are proportional to the value of the time series at that given time. To learn more about additive and multiplicative time series models, check out my previous blog post about it:

Time Series Decomposition

Let’s now go over the equations for both Holt Winters’ models:

Additive:

Equation generated by author in LaTeX.

Where m is the seasonal period of the time series (the number of observations in one seasonal cycle), s_t is the seasonal forecast component, s_{t-m} is the seasonal component from the same point in the previous season and γ is the seasonal component smoothing factor (0 ≤ γ ≤ 1−α).
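
For reference, the additive model in standard textbook notation is given below. The symbol k is an extra piece of notation not defined above: it is the integer part of (h−1)/m, which makes the seasonal index always point at the most recent estimate for that position in the cycle.

```latex
\begin{aligned}
\text{Forecast:} \quad & \hat{y}_{t+h} = l_t + h\, b_t + s_{t+h-m(k+1)}, \qquad k = \lfloor (h-1)/m \rfloor \\
\text{Level:}    \quad & l_t = \alpha\,(y_t - s_{t-m}) + (1 - \alpha)\,(l_{t-1} + b_{t-1}) \\
\text{Trend:}    \quad & b_t = \beta\,(l_t - l_{t-1}) + (1 - \beta)\, b_{t-1} \\
\text{Seasonal:} \quad & s_t = \gamma\,(y_t - l_{t-1} - b_{t-1}) + (1 - \gamma)\, s_{t-m}
\end{aligned}
```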

If you want to learn more about seasonality, check out my previous blog post on it here:

Seasonality of Time Series

Multiplicative:

Equation generated by author in LaTeX.
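
For reference, the multiplicative model in standard textbook notation replaces the subtractions of the seasonal component with divisions and scales the forecast by the seasonal factor. As in the additive case, k is the integer part of (h−1)/m.

```latex
\begin{aligned}
\text{Forecast:} \quad & \hat{y}_{t+h} = (l_t + h\, b_t)\, s_{t+h-m(k+1)}, \qquad k = \lfloor (h-1)/m \rfloor \\
\text{Level:}    \quad & l_t = \alpha\,\frac{y_t}{s_{t-m}} + (1 - \alpha)\,(l_{t-1} + b_{t-1}) \\
\text{Trend:}    \quad & b_t = \beta\,(l_t - l_{t-1}) + (1 - \beta)\, b_{t-1} \\
\text{Seasonal:} \quad & s_t = \gamma\,\frac{y_t}{l_{t-1} + b_{t-1}} + (1 - \gamma)\, s_{t-m}
\end{aligned}
```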

Without going into the nuances of these equations, what they are trying to do is calculate the trend line for the time series and weight the values on the trend line by the seasonal variations.
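
To make this concrete, here is a minimal pure-Python sketch of one pass through the Holt Winters’ updates followed by an h-step forecast. The function name `holt_winters_forecast` and the naive initial states (first-season average for the level, difference of the first two season averages for the trend) are assumptions made for illustration. A library such as statsmodels estimates the initial states and smoothing factors by optimisation instead.

```python
def holt_winters_forecast(y, m, alpha, beta, gamma, horizon, multiplicative=False):
    """Run the Holt Winters' recursions over y and forecast `horizon` steps ahead.

    y is a sequence of observations and m is the seasonal period.
    Initial states are naive guesses (an assumption of this sketch).
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialise the states")

    first_season_mean = sum(y[:m]) / m
    second_season_mean = sum(y[m:2 * m]) / m

    level = first_season_mean
    trend = (second_season_mean - first_season_mean) / m
    # season[i] always holds the latest estimate for position i in the cycle.
    if multiplicative:
        season = [y[i] / first_season_mean for i in range(m)]
    else:
        season = [y[i] - first_season_mean for i in range(m)]

    for t in range(m, len(y)):
        s_prev = season[t % m]                   # s_{t-m}
        prev_level, prev_trend = level, trend    # l_{t-1}, b_{t-1}
        if multiplicative:
            level = alpha * (y[t] / s_prev) + (1 - alpha) * (prev_level + prev_trend)
            season[t % m] = (gamma * (y[t] / (prev_level + prev_trend))
                             + (1 - gamma) * s_prev)
        else:
            level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
            season[t % m] = (gamma * (y[t] - prev_level - prev_trend)
                             + (1 - gamma) * s_prev)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend

    last = len(y) - 1
    forecasts = []
    for h in range(1, horizon + 1):
        s = season[(last + h) % m]               # latest estimate for that position
        trend_line = level + h * trend           # value on the trend line
        forecasts.append(trend_line * s if multiplicative else trend_line + s)
    return forecasts


# A perfectly periodic series with no trend is reproduced exactly.
history = [10, 20, 30, 40] * 3
print(holt_winters_forecast(history, m=4, alpha=0.5, beta=0.5, gamma=0.5, horizon=4))
# → [10.0, 20.0, 30.0, 40.0]
```

The only difference between the two variants is whether the seasonal component is subtracted and added, or divided and multiplied, which mirrors the additive and multiplicative equations.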

Just to note, there are other forms of these equations that contain a damping parameter. We will not cover these in this article, but the interested reader can learn more about this here.

Enough of all this boring maths, let’s implement the model in Python!

Python Example

We will use the US airline dataset as always and use the ExponentialSmoothing class from the statsmodels library to fit a Holt Winters’ forecasting model.

Data sourced from Kaggle with a CC0 licence.

Plot generated by author in Python.

From the above plot we see that the Holt Winters’ forecast is by far the best as it captures both the trend and seasonality of the time series.

When creating the model we passed the parameters seasonal_periods, trend and seasonal to the model object. From the above plot, there is clearly a yearly seasonality in this monthly data, so we set seasonal_periods=12. Additionally, the trend is not quite a straight line, so we treat it as multiplicative, hence trend='mul'. Finally, the seasonal fluctuations are not a constant size but are proportional to the level of the time series, therefore the seasonality is multiplicative: seasonal='mul'.

The Holt Winters’ model can be further diagnosed by calling the summary method:

print(model_holt_winters.summary())
Image by author generated in Python.

The smoothing_level, α, and smoothing_seasonal, γ, parameters are relatively high, indicating that the level and seasonal components change quickly in response to recent observations. However, the smoothing_trend, β, value is quite small, meaning the trend doesn’t vary all that much.

Summary and Further Thoughts

This takes us to the end of the exponential smoothing family by discussing probably the most useful model, Holt Winters’. This model takes into account both trend and seasonality components, so it can be used to model many time series effectively. As shown by our Python example, it captured both the seasonality and trend components very well.

Full code used in this article is available at my GitHub here:

Medium-Articles/holt_winters.py at main · egorhowell/Medium-Articles
