
Forecasting the Future: How Can We Predict Tomorrow’s Demand Using Yesterday’s Insights?

Demand planners and other company stakeholders use forecasts to anticipate future needs and adjust supply accordingly.

While AI models have taken the spotlight, traditional statistical models remain highly valuable tools for demand forecasting

Photo by petr sidorov on Unsplash

Hello Medium readers!

Today, we’ll dive into forecasting techniques applied to demand planning, a field that I’m highly invested in due to my supply chain background and passion for Data Science. Recently, I’ve been reading up on this topic, revisiting books and articles on demand forecasting to provide you with some fresh insights.

To kick things off, let me share a thought-provoking quote by British statistician George E. P. Box:

"All models are wrong, but some are useful."

As you reflect on this quote, you might wonder: why even bother forecasting the future if no model can ever be entirely accurate? Think of it like weather forecasting: it helps us plan ahead. Should I bring an umbrella tomorrow? Should I put on sunscreen? Should I take shelter from a hurricane? Forecasts, while imperfect, guide us in making better decisions.

In demand planning, it’s no different. Demand planners and other company stakeholders use forecasts to anticipate future needs and adjust supply accordingly. The goal is to avoid overproduction or shortages, ensuring customers get what they need without excess waste.

Over the years, many models have been developed to predict demand. With the rise of AI, more sophisticated models have emerged. But as George Box reminds us, all models are wrong. This doesn’t mean they’re useless; it just means that no model can perfectly capture the complexity of reality. Every prediction carries some level of uncertainty.

Photo by John Michael Thomson on Unsplash

For instance, consider a bookstore. The factors influencing demand are numerous and often hard to define: the store’s location, online presence, reputation, operating hours, and so on. And let’s not forget the human element: customers. Understanding why someone decides to buy a book, when, and how many, involves complex human behavior that is tough to pin down with precision.

That said, models remain valuable. They provide us with a reasonable expectation of what the future holds, even if they aren’t 100% accurate. Of course, there’s always a margin of error, but it can be controlled and minimized as we refine the models over time.

So, how do we forecast future demand? By learning from the past. Today, when people hear the word predictions, they often think of artificial intelligence, especially machine learning models capable of analyzing vast amounts of historical data to predict future trends. However, traditional statistical models are also powerful tools for forecasting.

In this article, I’ll focus on statistical models and how they’re used in demand forecasting. But don’t worry, my next article will tackle the field of machine learning for predicting future demand!

The Moving Average: A Straightforward Approach to Statistical Demand Forecasting

The moving average model is a simple yet effective forecasting technique that assumes future demand is closely related to the average of recently observed demand. This model works by calculating the average demand over a specified number of past periods (denoted as n), and using that average to predict the future demand. The logic behind this is that recent demand patterns are likely to repeat in the near future.

The moving average forecast for period t+1 (the next period) is calculated as:

Forecast(t+1) = (1/n) * [Demand(t) + Demand(t−1) + … + Demand(t−n+1)]

Where:

  • n = the number of periods used to calculate the moving average
  • Demand(t-i) = the actual demand observed in period t−i
  • t+1 = the period for which the forecast is being made

Let’s say you are trying to predict demand for a product in the upcoming month based on the last 3 months of data. Suppose the demand over the past three months was as follows:

  • Month 1: 100 units
  • Month 2: 120 units
  • Month 3: 110 units

Using a 3-month moving average:

Forecast(Month 4) = (100 + 120 + 110) / 3 = 110 units

So, the forecasted demand for the next month (Month 4) is 110 units, based on the average demand from the last 3 months.
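This calculation can be sketched in a few lines of Python, using the demand figures from the example above:

```python
def moving_average_forecast(demand, n):
    """Forecast the next period as the mean of the last n observations."""
    if len(demand) < n:
        raise ValueError("need at least n observations")
    return sum(demand[-n:]) / n

history = [100, 120, 110]  # Months 1-3 from the example above
print(moving_average_forecast(history, n=3))  # 110.0
```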

While the moving average model is simple and easy to implement, it has three key limitations:

  1. No Trend Sensitivity: It cannot account for upward or downward trends in the data. If demand is steadily increasing or decreasing, the moving average will lag behind, providing forecasts that are less accurate.
  2. No Seasonality: The model does not recognize or adjust for seasonal patterns. For example, if there is a consistent spike in demand during holiday seasons, the moving average won’t reflect this unless additional adjustments are made.
  3. Equal Weighting: The moving average gives equal importance to all the historical periods used in the calculation. This means recent demand is treated no differently than older demand, which may not always be reflective of current trends or shifts in the market.

Simple Exponential Smoothing: A Better Statistical Model for Predicting Future Demand

Simple exponential smoothing is one of the most straightforward methods for forecasting a time series. This model is capable of identifying only the level of the demand from historical data.

💡 Level: The level refers to the average value around which demand fluctuates over time. It represents a smoothed version of demand.

In this model, future demand forecasts are generated based on the most recent estimate of the level. Simple exponential smoothing offers several advantages over naïve or moving average models:

  • Exponential Weighting: The weights assigned to each observation diminish exponentially over time, unlike moving average models that assign equal weight to all observations.
  • Reduced Impact of Outliers: Outliers and noise have a lesser effect compared to naïve forecasting methods.

The fundamental concept behind any exponential smoothing model is that, at each period, the model learns from the latest demand observation while retaining some information from its previous forecasts.

The smoothing parameter, or learning rate (α), determines the importance placed on the most recent demand observation:

Forecast(t) = α * Demand(t−1) + (1 − α) * Forecast(t−1)

Where:

  • α = the learning rate
  • Demand(t-1) = the previous registered demand
  • forecast(t-1) = the previous forecast

The beauty of this formula lies in the fact that the last forecast already incorporates a portion of both the previous demand and the previous forecast. This means the model has learned from historical demand data up to that point.

There is a crucial trade-off between learning and remembering, between reactivity and stability. A higher alpha value means the model emphasizes recent demand more and reacts quickly to changes, but it also becomes sensitive to outliers and noise. Conversely, a lower alpha will make the model less reactive to changes in demand levels, but it will be more robust against noise and outliers.

Photo by The Nix Company on Unsplash

Let’s imagine for a second a retail store that needs to forecast demand for jackets. Demand is highly seasonal with peaks in winter and low demand in summer. They are using the simple exponential smoothing method above to update the forecast each week based on actual demand, adjusting the learning rate parameter α:

  1. High Alpha Value (Emphasizing Recent Data):
  • Suppose the store sets alpha to a high value, such as 0.8. This means the model heavily favors recent data.
  • If there’s a sudden cold wave in early autumn, jacket sales spike. The model quickly increases its demand forecast, adjusting to this sudden change.
  • Advantage: The store can quickly adapt to unexpected events such as a cold wave and ensure they’re stocked up on jackets.
  • Drawback: The model may overreact if there’s an outlier, like a single week of unexpectedly high sales due to a one-time promotion. This sensitivity might lead the store to overstock jackets, only for demand to drop back to normal the next week.
  2. Low Alpha Value (Smoothing Out Data):
  • Now let’s say the store sets alpha to a lower value, such as 0.2. This makes the model less reactive to recent changes, averaging demand over a longer period.
  • In the case of the cold wave, the model doesn’t immediately increase its forecast. Instead, it treats the spike as part of a trend, making only slight adjustments.
  • Advantage: This approach avoids overreacting to short-term events, so the store won’t suddenly overstock based on a single week’s data.
  • Drawback: The store may not react fast enough to an actual shift in seasonal demand, like a prolonged cold period. In this case, they could miss out on potential sales by understocking jackets.
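The jacket-store scenario can be sketched in a few lines of Python; the weekly sales figures below are made up for illustration, with a cold-wave spike in the final week:

```python
def ses_forecast(demand, alpha):
    """Simple exponential smoothing: f(t) = alpha*d(t-1) + (1-alpha)*f(t-1)."""
    forecast = demand[0]  # initialize with the first observation
    for d in demand[1:]:
        forecast = alpha * d + (1 - alpha) * forecast
    return forecast  # forecast for the next, still unseen period

weekly_sales = [100, 102, 98, 101, 150]  # last week: sudden cold-wave spike
reactive = ses_forecast(weekly_sales, alpha=0.8)  # ~140: chases the spike
stable = ses_forecast(weekly_sales, alpha=0.2)    # ~110: dampens the spike
```

The high-alpha forecast jumps toward the spike, while the low-alpha forecast stays close to the long-run average, mirroring the advantages and drawbacks listed above.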

While this statistical model provides flexibility in balancing recent demand with forecast stabilization, the example above also highlights limitations in trend projection, seasonal patterns, and the analysis of external variables.

  • Trend Projection: Simple exponential smoothing does not account for trends. This limitation can be addressed with double exponential smoothing, which incorporates trend components.
  • Seasonality Recognition: The model does not recognize seasonal patterns. This can be remedied with the triple exponential smoothing model.
  • External Variables: It cannot natively incorporate external explanatory variables, such as pricing or marketing expenses.

Double Exponential Smoothing: A Step Up from Simple Exponential Smoothing

A key limitation of simple exponential smoothing is its inability to detect and project trends in the data. This model can only forecast the level of demand.

Photo by m. on Unsplash

💡 Trend: The trend is defined as the average variation in the time series level between two consecutive periods.

Double exponential smoothing addresses this limitation by not only predicting the level but also forecasting the trend over time. It does so by applying an exponential weight, denoted by Beta (β), to emphasize the importance of more recent observations.

The fundamental principle behind exponential smoothing models is that each demand component – currently the level and trend – is updated after each period based on two key pieces of information: the last observation and the previous estimate of each component.

Assuming that the forecast is the sum of the level a and the trend b, i.e. f(t+1) = a(t) + b(t), the estimation of the level is given by:

a(t) = α * d(t) + (1 − α) * (a(t−1) + b(t−1))

The model will update its estimation of the level a(t) at each period, thanks to two pieces of information:

  • the last demand observation d(t)
  • the previous level estimation a(t−1), increased by the trend b(t−1)

On the other hand, the estimation of the trend is given by:

b(t) = β * (a(t) − a(t−1)) + (1 − β) * b(t−1)
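Putting the level and trend updates together, here is a minimal Python sketch of double exponential smoothing (Holt's method); the naive initialization of the level and trend from the first two observations is an illustrative assumption, not the only option:

```python
def holt_forecast(demand, alpha, beta):
    """One-step-ahead forecast f(t+1) = a(t) + b(t) via Holt's updates."""
    level = demand[0]              # naive initialization of the level a
    trend = demand[1] - demand[0]  # naive initialization of the trend b
    for d in demand[2:]:
        prev_level = level
        level = alpha * d + (1 - alpha) * (level + trend)         # a(t)
        trend = beta * (level - prev_level) + (1 - beta) * trend  # b(t)
    return level + trend

rising_demand = [10, 20, 30, 40, 50]
print(holt_forecast(rising_demand, alpha=0.8, beta=0.5))  # projects the upward trend past 50
```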

However, like any model, the double exponential smoothing still has some limitations:

  • Constant trend beyond historical period:

When the model doesn’t have any new information about how demand might be changing, it will assume that the trend (direction and rate) remains constant from that point forward.

Imagine a store that sells swimsuits. During the summer (historical period), demand increased by 5% each month. At the end of summer (forecasting period), the model will continue predicting a 5% monthly increase, assuming that demand will keep rising. However, as autumn approaches, the actual demand drops sharply. The model’s fixed trend assumption will now significantly overestimate demand, leading to excess inventory and potentially higher costs.

Of course, another variant of double exponential smoothing exists to address this issue: Holt's linear exponential smoothing with an additive damped trend.

For further reading about this statistical model, I suggest the following papers and articles:

> Damped-trend-Modelling.pdf by C. T. Bauer College of Business

> Double Exponential Smoothing | SAP Help Portal

  • Lack of seasonality: the model cannot capture recurring seasonal patterns.
  • Inability to account for external factors, such as pricing or marketing expenses.

Triple Exponential Smoothing: A More Effective Way to Address Seasonality

Triple exponential smoothing is an extension of double exponential smoothing that incorporates both trends and seasonality in time series forecasting. This model is particularly useful for datasets with seasonal patterns, allowing for more accurate predictions by accounting for fluctuations that repeat over specific intervals (e.g., daily, monthly, or yearly).

Photo by Collab Media on Unsplash

Therefore, key components of the triple exponential smoothing are:

  1. Level (a): The average value around which the data fluctuates.
  2. Trend (b): The long-term progression of the data (increasing or decreasing).
  3. Seasonality (s): The repeating patterns or cycles in the data that occur at fixed intervals.

The triple exponential smoothing model (with a damped trend) will be the following one:

f(t+1) = (a(t) + Φ * b(t)) * s(t+1−p)

where:

  • f(t+1) = forecast in next period t+1
  • a(t) = level at period t
  • Φ * b(t) = damped trend at period t
  • s(t+1−p) = the seasonal factor applied to period t+1

The term t+1−p refers to an earlier period within the same seasonal cycle and variable p is the length of the seasonality cycle. For instance:

  • For monthly data with yearly seasonality, p = 12 (since there are 12 months in a year)
  • For weekly data with daily seasonality, p = 7 (as there are 7 days in a week)

The seasonal factor is often calculated based on historical data by examining the recurring patterns. Here’s how it functions:

Amplifying or Reducing the Forecast:

  • The seasonal factor s(t+1−p)​ either amplifies or reduces the base forecast a(t) + Φ * b(t) to reflect higher or lower demand that typically occurs in that specific period
  • For example, if s(t+1−p) > 1, it will increase the forecast, indicating a high-demand season (e.g., holidays for retail)
  • If s(t+1−p) < 1, it will reduce the forecast, indicating a low-demand season (e.g., winter for swimsuit sales).

Aligning with the Seasonal Cycle:

  • The seasonal factor s(t+1−p)​ is selected to match the point within the season’s cycle that corresponds to the forecast period t+1. This ensures that, say, each January is adjusted by the same factor if monthly seasonality is in play, or each Monday is adjusted similarly if weekly seasonality is applied.

For a retail store that experiences seasonal demand:

  • Suppose p=12 (monthly data with yearly seasonality).
  • If you’re forecasting for January 2024 (period t+1), you’d apply the seasonal factor s1 (as January was the first month in the last cycle).
  • If January historically has 20% higher demand than the average, then s(t+1−p) for January will be 1.2, increasing the forecast.
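The full update cycle can be sketched compactly in Python. This is a minimal illustration of multiplicative triple exponential smoothing with a damped trend; the initialization scheme (first cycle sets the seasonal factors, first two cycles set the trend) and the quarterly demand series are illustrative assumptions, and a production-grade implementation is available in statsmodels:

```python
def holt_winters_forecast(demand, alpha, beta, gamma, p, phi=1.0):
    """f(t+1) = (a(t) + phi * b(t)) * s(t+1-p), multiplicative seasonality."""
    assert len(demand) >= 2 * p, "need at least two full seasonal cycles"
    first_cycle_avg = sum(demand[:p]) / p
    # Seasonal factors: each period's share of the first cycle's average
    seasonals = [demand[i] / first_cycle_avg for i in range(p)]
    level = first_cycle_avg
    # Average per-period change between the first two cycles
    trend = (sum(demand[p:2 * p]) - sum(demand[:p])) / (p * p)
    for t in range(p, len(demand)):
        s = seasonals[t % p]
        prev_level = level
        level = alpha * (demand[t] / s) + (1 - alpha) * (level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
        seasonals[t % p] = gamma * (demand[t] / level) + (1 - gamma) * s
    return (level + phi * trend) * seasonals[len(demand) % p]

# Two years of quarterly demand with a seasonal peak in Q4 and mild growth
quarters = [120, 80, 100, 140, 132, 88, 110, 154]
forecast = holt_winters_forecast(quarters, alpha=0.5, beta=0.3, gamma=0.2, p=4)
```

Since the next period is a Q1, the forecast is scaled by the above-average Q1 seasonal factor, exactly as in the January example above.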

Conclusion

Each forecasting model has its unique strengths and limitations. Statistical models, such as exponential smoothing, offer effective and straightforward solutions for demand forecasting. They are often easier to implement, interpret, and maintain compared to more complex AI/ML models.

However, when higher forecasting accuracy is required, or when a broader range of explanatory variables must be incorporated (such as external economic indicators, promotions, or seasonality), ML models may offer advantages.

Ultimately, the best choice depends on our specific KPIs, error metrics, and the complexity of the demand patterns we need to capture.


Stay Tuned!

Coming up next, I’ll explore demand forecasting with machine learning and share some handy Python scripts to get you started. I’ll also provide hands-on examples, especially focusing on triple exponential smoothing. Don’t miss out on these practical tips to level up your forecasting game!

I hope you liked this article! Feel free to leave a clap 👏 , a comment 🗨 ️ or to share 📨 this article with your friends.

Thanks for reading and don’t forget to follow me or subscribe for more articles 🚀

