Finding the Right Forecasting Aggregation Level

To set up a perfect demand forecasting process, you need to get four things right: granularity, temporality, metrics, and process.

Nicolas Vandeput
Towards Data Science
8 min read · Jan 11, 2021

Theodor Galle nach Jan van der Straet (Stradanus), Destillierlabor c. 1589 — c. 1593

When it comes to demand forecasting, most supply chains rely on populating 18-month forecasts with monthly buckets. Should this be considered a best practice, or is it merely a by-default, overlooked choice? I have seen countless supply chains forecasting demand at an irrelevant aggregation level — whether material, geographical or temporal. In this article, I propose an original 4-dimensions forecasting framework that will enable you to set up a tailor-made forecasting process for your supply chain. I like to use this framework to kick off any forecasting project.

An accurate forecast is not good enough.

You need a useful one.

The 4-Dimensions Forecasting Framework

Demand Forecasting to Support Decision-Making

Supply chains are living organisms making hundreds (if not thousands) of decisions daily. To make the best possible decisions, you need the right pieces of information. Most supply chain decisions rely on demand estimates. Your demand forecast is, therefore, a critical piece of information to make the right decisions. A useful* forecast should allow your supply chain to improve its service level, plan better, and reduce waste and overall costs. As the demand forecast is used to trigger specific actions, it should be done at the right aggregation level, tracked with the relevant metrics, and supported by an efficient review process.
*I use the term ‘useful’ and not ‘accurate’ on purpose. A forecast could be accurate but irrelevant to the decisions you need to make.

In short, forecasting demand is always a means to an end, not the end itself.

When setting up a forecasting process, you will have to set it across four dimensions: granularity, temporality, metrics, and process (I call this the 4-Dimensions Forecasting Framework).

We will discuss these dimensions one by one and set up our demand forecasting process based on the decisions you need to make. Once you know the kind of decisions you need to make (e.g., how much to produce, where to deploy inventory, whether to open or close plants), setting up your forecasting process according to this framework should be easy.

4-dimensions forecasting framework (copyright: Nicolas Vandeput)

1. Granularity

You should first work on determining the right geographical and material granularity for your forecast.

  • 🗺️ Geographical. Should you forecast per country, region, market, channel, customer segment, warehouse, store?
  • 📦 Material. Should you forecast per product, segment, brand, value, weight, type of raw material required?

To answer those questions, you have to think about the decisions taken by your supply chain based on this forecast. Remember, a forecast is only relevant if it helps your supply chain to take action.

Let’s discuss a few examples:

  • Let’s imagine you need to decide which products to ship from your plant to your regional warehouses. In that case, it might be a good idea to aggregate demand by warehouses’ regions and forecast demand directly at this geographical level.
    ⚠️ Note that forecasting warehouse demand based on historical orders fulfilled by this warehouse is a bad practice as logistic constraints might have impacted historical shipments (from time to time, a warehouse serves clients from another warehouse's usual region).
    ✅ Instead, you should forecast demand from the geographical region that the warehouse should serve irrespective of the warehouse that actually served these orders. In other words, you should forecast warehouse demand based on what should have been fulfilled from the warehouse if there were no constraints.
  • On the other hand, here is a bad example: many supply chains still forecast demand per country even though they have multiple warehouses serving different areas of the same country. In this case, there is a clear discrepancy between the decision that needs to be made (to which warehouse should we ship the goods?) and the information used to make it (we will sell that much in this country). This discrepancy will often result in poor inventory allocation across warehouses.
  • We can imagine many other use cases: if your production process needs to produce products in various specific packagings, you should forecast per packaging. When reviewing your forecast, you should then discuss what impacts the ratio of each packaging type: commercial events, promotions, and so on.
  • If different warehouses (or processes) serve different sales channels, you should forecast them separately. On the other hand, if you only have a single warehouse, you should ask yourself if you really have to make a forecast per region or if a single forecast done at a global level wouldn’t be enough.
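To make the warehouse example above concrete, here is a minimal sketch of the difference between counting what each warehouse actually shipped and counting what it should have shipped. The order records, region names, and the region-to-warehouse mapping are all made up for illustration; your own data model will differ.

```python
from collections import defaultdict

# Hypothetical order history: (customer_region, fulfilling_warehouse, quantity).
orders = [
    ("North", "WH_North", 120),
    ("North", "WH_South", 30),   # a North customer served from WH_South (e.g. during a stock-out)
    ("South", "WH_South", 200),
    ("South", "WH_South", 50),
]

# Assumed mapping: the warehouse that should serve each region.
assigned_warehouse = {"North": "WH_North", "South": "WH_South"}


def demand_by_fulfilling_warehouse(orders):
    """Constrained view: total quantity each warehouse actually shipped."""
    totals = defaultdict(int)
    for _region, warehouse, qty in orders:
        totals[warehouse] += qty
    return dict(totals)


def demand_by_assigned_warehouse(orders, assigned_warehouse):
    """Unconstrained view: total quantity each warehouse should have shipped,
    based on the customer's region rather than on who fulfilled the order."""
    totals = defaultdict(int)
    for region, _warehouse, qty in orders:
        totals[assigned_warehouse[region]] += qty
    return dict(totals)


shipped = demand_by_fulfilling_warehouse(orders)
unconstrained = demand_by_assigned_warehouse(orders, assigned_warehouse)

print(shipped)        # {'WH_North': 120, 'WH_South': 280}
print(unconstrained)  # {'WH_North': 150, 'WH_South': 250}
```

In this toy data set, a forecast trained on shipments would understate WH_North's demand by 30 units. The unconstrained series is the one you would want to feed to your forecasting model.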

2. Temporality

Once you know what granularity level you will be working on, you should pick the right forecasting horizon and temporal aggregation (time bucket). Many supply chains stick to forecasting demand 18 or 24 months ahead, even though the time of demand planners (and of anyone else working on the forecast) is limited. You need to pick a limited horizon to focus on.

  • 🗓️ Temporal Aggregation. What temporal aggregation bucket should you use (daily, weekly, monthly, quarterly, or yearly)?
  • 🔭 Horizon. How many periods do you need to forecast (one month, six months, two years)?

Again, you should answer these questions by thinking about what your supply chain is trying to optimize/achieve and the lead times involved with these decisions.

Let’s take two examples:

  • Your supplier needs to receive monthly orders three months in advance. You should work with monthly buckets and a horizon of 3 months (M+1/+2/+3). Any forecast beyond M+3 should not be your focus.
  • If you need a forecast to know which goods to ship from your central warehouse to your local warehouses, you should focus on a horizon equivalent to your internal lead time (usually a few days or weeks).
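The two examples above boil down to two small operations: summing demand into the time bucket you decided on, and counting how many buckets are needed to cover the lead time. Here is a rough sketch using only the standard library. The function names, the ISO-week convention, and the sample data are my own assumptions for illustration.

```python
import math
from datetime import date, timedelta


def to_weekly_buckets(daily_demand):
    """Sum daily demand into ISO weeks: {(iso_year, iso_week): quantity}."""
    weekly = {}
    for day, qty in daily_demand.items():
        iso = day.isocalendar()
        key = (iso[0], iso[1])
        weekly[key] = weekly.get(key, 0) + qty
    return weekly


def horizon_in_buckets(lead_time_days, bucket_days):
    """Number of time buckets needed to cover the lead time (rounded up)."""
    return math.ceil(lead_time_days / bucket_days)


# Hypothetical data: 14 days of demand (10 units/day) starting Monday 2021-01-04.
start = date(2021, 1, 4)
daily = {start + timedelta(days=i): 10 for i in range(14)}

weekly = to_weekly_buckets(daily)
print(weekly)                      # {(2021, 1): 70, (2021, 2): 70}

# A 10-day internal lead time with weekly buckets: focus on 2 weeks.
print(horizon_in_buckets(10, 7))   # 2
# A 90-day supplier lead time with (roughly) 30-day monthly buckets: M+1 to M+3.
print(horizon_in_buckets(90, 30))  # 3
```

Treating a month as 30 days is a simplification. The point is only that the horizon follows from the lead time of the decision, not from habit.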

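Models and Forecasting Horizon. Statistical models (such as exponential smoothing) can easily extrapolate a forecast over an arbitrarily long horizon. This is usually not the case with machine learning models, which are typically trained to predict a fixed number of periods ahead. So, you might have to stick to statistical models for long-term forecasting.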

3. Metrics 🎯

Practitioners usually overlook the question of forecasting metrics. Choosing the right metrics for a forecasting process/model is actually anything but straightforward, and it will have a profound impact on the resulting forecasts. Depending on the metric selected, you might give too much importance to outliers (RMSE weakness) or risk a biased forecast (MAE weakness). For a detailed discussion about forecasting KPIs, see my article Forecasting KPIs: RMSE, MAE, MAPE & Bias. [1]
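To see why the choice of metric shapes the forecast itself, consider a small experiment. The sketch below takes a made-up, skewed demand history (one large peak) and searches for the constant forecast that scores best on MAE and on RMSE. The data and the brute-force search are assumptions chosen purely for illustration.

```python
from statistics import mean


def mae(demand, forecast):
    """Mean absolute error of a constant forecast."""
    return mean(abs(forecast - d) for d in demand)


def rmse(demand, forecast):
    """Root mean squared error of a constant forecast."""
    return mean((forecast - d) ** 2 for d in demand) ** 0.5


# Hypothetical demand history with a single large peak.
demand = [0, 1, 1, 2, 2, 3, 3, 4, 29]

# Try every integer forecast from 0 to 30 and keep the best one per metric.
candidates = range(0, 31)
best_for_mae = min(candidates, key=lambda f: mae(demand, f))
best_for_rmse = min(candidates, key=lambda f: rmse(demand, f))
average_demand = mean(demand)

print(best_for_mae)    # 2 (the median of the demand)
print(best_for_rmse)   # 5 (the average of the demand)
print(average_demand)  # 5
```

Optimizing MAE pulls the forecast toward the median (2 units per period against an average demand of 5, hence a consistent under-forecast), whereas optimizing RMSE pulls it toward the average, which is driven up by the single peak of 29.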

Here are a few pieces of advice to choose the right forecasting metrics:

  • Avoid MAPE. Many practitioners still use MAPE as a forecasting metric. It is a highly skewed indicator that will promote underforecasting. Avoid it.
  • Combine KPIs. Often, looking at a combination of KPIs (such as MAE & Bias) will be a good compromise enabling you to track accuracy and Bias while avoiding most traps and pitfalls.
  • Track consistent Bias. If you observe a consistent bias (over/under forecasting) for a specific item, it is an important clue that something is wrong with the model/forecasting process.
  • 💡 Weighted KPIs. In the second edition of my book Data Science for Supply Chain Forecasting, I advise weighting each product (or SKU) in the overall metric calculation based on its profitability, cost, or overall supply chain impact. The idea is that you want to pay more attention to SKUs that matter the most. This is especially relevant as we want to find a metric that supports your supply chain: a good score on your forecasting metric should align with business value.
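As a rough illustration of combining and weighting KPIs, the sketch below computes MAE and Bias per SKU, then an overall MAE weighted by a business weight. The sign convention (error = forecast minus demand), the SKU data, and the weights are assumptions for the example; pick weights that reflect what matters in your own supply chain (margin, cost, criticality).

```python
def bias(demand, forecast):
    """Average error (forecast - demand); positive means over-forecasting."""
    return sum(f - d for d, f in zip(demand, forecast)) / len(demand)


def mae(demand, forecast):
    """Mean absolute error."""
    return sum(abs(f - d) for d, f in zip(demand, forecast)) / len(demand)


def weighted_mae(per_sku, weights):
    """Average of the SKUs' MAE, weighted by a business weight per SKU."""
    total_weight = sum(weights[sku] for sku in per_sku)
    weighted_sum = sum(weights[sku] * mae(d, f) for sku, (d, f) in per_sku.items())
    return weighted_sum / total_weight


# Hypothetical history: SKU -> (demand, forecast) over four periods.
per_sku = {
    "SKU_A": ([100, 120, 80, 100], [110, 110, 90, 90]),
    "SKU_B": ([10, 12, 8, 10], [6, 8, 4, 6]),
}
# Assumed business weights (e.g. relative unit margin).
weights = {"SKU_A": 1.0, "SKU_B": 4.0}

for sku, (d, f) in per_sku.items():
    print(sku, "MAE:", mae(d, f), "Bias:", bias(d, f))
# SKU_A MAE: 10.0 Bias: 0.0
# SKU_B MAE: 4.0 Bias: -4.0

print(weighted_mae(per_sku, weights))  # 5.2
```

SKU_A has the larger MAE but no bias, while SKU_B is consistently under-forecast. Looking at MAE alone would hide that, and the weighting makes SKU_B count four times as much as SKU_A in the overall score.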

Beyond the math, it is essential to align the forecasting KPIs to the required material and temporal granularities. For example, suppose you are interested in ordering goods from an overseas supplier with a lead time of 3 months. In that case, you should measure accuracy over a forecasting horizon of months +1, +2, and +3 (or, even better, calculate the cumulative error over the three months) instead of merely looking at the accuracy achieved at month +1.
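Here is a small sketch of the overseas supplier example, with made-up numbers. It compares the error at month +1 with the cumulative error over the three-month lead time (again using the convention error = forecast minus demand, which is an assumption of this example).

```python
def cumulative_error(demand, forecast):
    """Error on the total over the horizon: sum(forecast) - sum(demand)."""
    return sum(forecast) - sum(demand)


# Hypothetical figures for months M+1, M+2, M+3 (forecast made in month M).
demand = [100, 140, 120]
forecast = [100, 110, 100]

error_m1 = forecast[0] - demand[0]
error_cumulative = cumulative_error(demand, forecast)

print(error_m1)          # 0
print(error_cumulative)  # -50
```

Judged at month +1 only, this forecast looks perfect. Over the full lead time, it falls 50 units short of what you actually needed to order.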

4. Process ⚙️

Now that you know your material and temporal aggregation, horizon, and metrics, you can set up a process. This process should be defined with three specific aspects.

1. Stakeholders. Who will review the forecast?
Bringing different points of view to the table — using various information sources — will help create a more accurate forecast. But this can only be done properly if the review process is done thoroughly (otherwise, be ready to face wars of influence).

2. Periodicity. When do you review the forecast?
Updating your forecast more often might improve its accuracy (as you have fresher data at hand). However, updating it too often might create chaos as you overreact to demand changes and consume too many resources for limited added value.

3. Review Process. How do you review the forecast?
At the core of any forecasting process, there should be a measurement of the forecast value added. Tracking each team member’s value-added will enable you to improve the forecasting process efficiency (and refine the relevant forecasting periodicity and stakeholders).

📖 Forecast Value Added Framework. A forecasting process framework that tracks the added value of each team/process step compared to a benchmark (or to the previous team’s input). It was devised and promoted by Michael Gilliland in the 2010s (see his book on the topic).
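A minimal sketch of how forecast value added could be tracked, assuming MAE as the metric and a simple chain of steps where each one is compared to the previous one. The step names and figures are invented for illustration; this is one possible way to compute it, not a reference implementation of the framework.

```python
def mae(demand, forecast):
    """Mean absolute error."""
    return sum(abs(f - d) for d, f in zip(demand, forecast)) / len(demand)


def forecast_value_added(demand, steps):
    """Return [(step_name, MAE reduction vs. the previous step), ...].

    `steps` is an ordered list of (name, forecast); the first entry is the
    benchmark. A positive value means the step improved the forecast.
    """
    results = []
    previous_mae = None
    for name, forecast in steps:
        current_mae = mae(demand, forecast)
        if previous_mae is not None:
            results.append((name, previous_mae - current_mae))
        previous_mae = current_mae
    return results


# Hypothetical demand and successive forecasts over four periods.
demand = [100, 120, 80, 100]
steps = [
    ("naive benchmark", [90, 100, 120, 80]),
    ("statistical model", [100, 110, 90, 100]),
    ("sales review", [110, 130, 100, 110]),
]

fva = forecast_value_added(demand, steps)
print(fva)  # [('statistical model', 17.5), ('sales review', -7.5)]
```

In this example, the statistical model reduces the benchmark's MAE by 17.5 units, but the sales review then makes the forecast worse by 7.5 units. That is exactly the kind of signal you want to discuss when refining who reviews the forecast and how often.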

Recap

Let’s recap with three examples:

  • Short-term forecast. Let’s imagine you need to decide every week what to ship to your stores. The forecast could be updated every week, with a horizon of a few weeks ahead. The granularity would be SKU per store. As you need to populate the forecast every week, the time to review it will be limited. Hence, only a few demand planners should validate it. Black-box machine learning models should typically be preferred here.
  • Mid-term forecast. You want to assess what to produce in the coming months. This is your typical S&OP forecast where you need to gather inputs from many stakeholders (sales, finance, marketing, planners, clients, suppliers). The forecast can be generated (and its accuracy measured) at a global level per SKU and once per month.
  • Long-term forecast. You need to set the budget for the upcoming year. This is a long-term forecast at a very aggregated level (most likely done at a value/revenue level per brand/segment). To create various scenarios (based on pricing, marketing, new product introduction), you will want to use a causal model where the weight of inputs can be set and discussed. Machine learning models should be avoided as they are a black box and struggle to forecast over the long term due to lack of data.


Consultant, Trainer, Author. I reduce forecast error by 30% 📈 and inventory levels by 20% 📦. Contact me: linkedin.com/in/vandeputnicolas