How To: Machine Learning-Driven Demand Forecasting

In this non-technical article, I will explain what machine learning is, how it works, and what you can expect from using it when forecasting demand. We will also discuss pitfalls and best practices when launching an ML initiative.

Nicolas Vandeput
Towards Data Science
8 min read · Oct 11, 2021

What is Machine Learning

Traditional statistical models apply a set of predefined relationships to a dataset. For example, exponential smoothing has its own fixed way of estimating the underlying demand level and trend.

On the other hand, machine learning is about letting an algorithm understand a dataset and its underlying relationships on its own.

Statistical models vs. Machine Learning models. Source: my demand forecasting training.
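To make the idea of "known relationships" concrete, here is a minimal sketch of double exponential smoothing (Holt's linear trend method), one common member of the exponential smoothing family. The function name, the default values of `alpha` and `beta`, and the simple initialization are assumptions chosen for illustration, not settings recommended in this article.

```python
def holt_forecast(demand, alpha=0.4, beta=0.2, horizon=3):
    """Double exponential smoothing (Holt's linear trend method).

    alpha smooths the level, beta smooths the trend. Both defaults are
    illustrative assumptions, not tuned values.
    """
    if len(demand) < 2:
        raise ValueError("need at least two demand observations")

    # Simple initialization: first observation and first difference.
    level = demand[0]
    trend = demand[1] - demand[0]

    for observed in demand[1:]:
        previous_level = level
        # The model's fixed rules: blend the new observation with the
        # previous projection, then update the trend the same way.
        level = alpha * observed + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend

    # Project the last level and trend over the forecast horizon.
    return [level + h * trend for h in range(1, horizon + 1)]


demand = [10, 12, 14, 16]  # hypothetical monthly demand
forecast = holt_forecast(demand, horizon=3)
print(forecast)
```

The update rules are written by hand and never change: the model only estimates a level and a trend, whatever the dataset looks like. A machine learning model, in contrast, has to discover such relationships from the data.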

How Does the Machine Learn?

A Machine Learning algorithm will run through a dataset, look at data features, and (try to) pick up any underlying relationship.

When working on a machine learning model, you need to pay attention to two main aspects:

  • The data (features) you give to your model.
  • The hyper-parameters of your machine learning model.

Choosing the correct data to feed to your model is tremendously important. Data scientists shouldn’t be left alone regarding what data to use; everyone should help. When you create your forecasting algorithm, you should ask yourself the following question:

If I had to make a demand forecast for how many products we are going to sell next month, what questions would I ask myself?

As you ask yourself — and your team — this question, you will get a glimpse of the most meaningful information to give to your model.

Here are a few typical answers:

  • What is the current pricing of my product, and did it change over the last months?
  • What is the monthly sales average of my product?
  • Was the product recently out of stock?
  • Are we currently running some promotions?

Providing relevant data to your ML model will allow it to predict future demand more accurately.

Typical inputs for a Machine Learning model. Source: my demand forecasting training
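The typical questions listed above can be translated into model inputs (features). The sketch below shows one possible way to do it for a single product. The dictionary keys (`demand`, `price`, `out_of_stock`, `promo`), the three-month window, and the function name are hypothetical choices made for illustration.

```python
from statistics import mean


def build_features(history, months_back=3):
    """Turn a product's monthly history (oldest first) into model inputs.

    Each month is a dict with the hypothetical keys
    'demand', 'price', 'out_of_stock' and 'promo'.
    """
    if not history:
        raise ValueError("history must contain at least one month")

    recent = history[-months_back:]
    latest = history[-1]
    return {
        # What is the current price, and did it change recently?
        "current_price": latest["price"],
        "price_change": latest["price"] - recent[0]["price"],
        # What is the monthly average of the product?
        "avg_monthly_demand": mean(month["demand"] for month in recent),
        # Was the product recently out of stock?
        "recent_stockout": any(month["out_of_stock"] for month in recent),
        # Are we currently running a promotion?
        "promo_running": latest["promo"],
    }


history = [  # hypothetical data for one product
    {"demand": 100, "price": 10.0, "out_of_stock": False, "promo": False},
    {"demand": 120, "price": 10.0, "out_of_stock": True, "promo": False},
    {"demand": 80, "price": 9.0, "out_of_stock": False, "promo": True},
]
features = build_features(history)
print(features)
```

Each feature answers one of the questions a planner would ask before making a forecast, which is why the whole team, not only data scientists, should help decide what goes into this table.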

What to Expect from ML-driven Demand Forecasting?

Many companies expect too much or too little from using a machine learning model to forecast their sales. If you expect and promise too much, top management will get frustrated, and demand planners will become cautious and reluctant to use an overpromised tool. On the other hand, if you expect too little from ML, you will never launch a data science initiative that will prove to have a very high ROI.

How much extra forecasting accuracy can you expect from using machine learning?

Recent forecasting competitions (such as M5 and Corporación Favorita) showed an error reduction of 20 to 60% compared to forecasting benchmarks. The Intermarché competition saw a minor reduction, but this is due to the logarithmic scale used (see my webinar on the topic).

Since 2018, all demand forecasting competitions have been won by machine learning. What are you waiting for?

For more information, see Learnings from Kaggle’s Forecasting Competitions by Casper Solheim Bojer & Jens Peder Meldgaard.

Based on my own consultancy experience, typical machine learning projects result in a forecast error reduction ranging from 5% to 20% compared to a moving average.

This accuracy improvement usually grows as you bring in more data, for example by providing more demand drivers (historical stock levels, promotions, marketing, pricing) or by forecasting at a daily or weekly level.

Machine Learning and Demand Planners

Note that by improving your forecasting baseline accuracy, you will also improve the overall forecasting process accuracy, as your demand planning team will be able to edit the forecast when needed (they should nearly always be able to add some extra accuracy). For example, let’s imagine your current forecast engine reaches an accuracy of 50% and that your team usually can raise it to 55% thanks to their work. By updating the model and using machine learning, you can reach a baseline accuracy of 55%. Then your team might be able to raise it further to 57 or 58%.

Demand planners can always improve a model’s forecast by using information that the model is unaware of (for example, by communicating with your clients).

Machine Learning models won’t make your demand planning team obsolete — but they might reduce their workload.

Pitfalls of using Machine Learning

Launching a machine learning initiative requires paying attention to a few critical aspects. If you miss them, you’re likely to fail to deliver results.

Data Quality

Don’t use sales data! You should forecast demand, not sales.

Bad data will beat a good forecaster. Every time.

Expectations

As discussed above, if you promise too much, you will disappoint (and face end-user resistance). On the other hand, if you promise too little, the project won’t get traction.

Bad Process

The objective of a demand forecast is not to be accurate, but to help your supply chain make the right decisions.

In short, if you are not forecasting the right thing, there is no point in improving your forecast accuracy. So first, you should make sure that you are forecasting your demand at the right level of aggregation. Then work on improving your model.

For example, many companies forecast demand by month and by market, whereas they need to deploy inventory on a weekly basis from their plants to a few warehouses around the world. It would make more sense to focus on weekly demand forecasting by warehouse than on monthly forecasting by market.

First, fix the process. Then, improve the model.

Wrong Metrics

I still see many supply chains using MAPE as a forecasting metric. There is simply no point in running any forecasting improvement until you are sure you are tracking the right metric.

Many supply chains also sell different products over a wide price range. A forecast error on a product worth 1 cent is less important than a similar forecast error on a product worth hundreds of euros/dollars.

I advocate for supply chains to track wMAE (price-weighted MAE) and wBias (price-weighted Bias). Combining these two metrics will allow you to pay attention to the products that matter the most and be sure not to have a biased model.
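As an illustration, here is a minimal sketch of how these two metrics could be computed. The exact formulation is an assumption: each product's error is weighted by its unit price, and the total is scaled by the total demand value so that both metrics read as percentages. The function name and the sample numbers are hypothetical.

```python
def weighted_metrics(forecasts, demands, prices):
    """Price-weighted MAE and bias, scaled by total demand value.

    Assumed formulation (not a definition given in this article):
        wMAE  = sum(price * |forecast - demand|) / sum(price * demand)
        wBias = sum(price * (forecast - demand)) / sum(price * demand)
    """
    total_value = sum(p * d for p, d in zip(prices, demands))
    if total_value == 0:
        raise ValueError("total demand value must be non-zero")

    errors = [f - d for f, d in zip(forecasts, demands)]
    w_mae = sum(p * abs(e) for p, e in zip(prices, errors)) / total_value
    w_bias = sum(p * e for p, e in zip(prices, errors)) / total_value
    return w_mae, w_bias


# Hypothetical example: a cheap product and an expensive one.
forecasts = [110, 8]
demands = [100, 10]
prices = [1, 100]

w_mae, w_bias = weighted_metrics(forecasts, demands, prices)
print(f"wMAE: {w_mae:.1%}, wBias: {w_bias:.1%}")
```

In this example, the cheap product is over-forecast by 10 units and the expensive one is under-forecast by only 2 units, yet the expensive product dominates both metrics. The negative wBias shows that, in value terms, the forecast is too low.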

Best Practices when using Machine Learning to Forecast Demand

Project Management

Gather a team of motivated, open-minded, curious, dedicated members (you’ll need different profiles). Based on my experience, the beginning of the machine learning journey is the most difficult: you need to gather and clean data without being able to show any short-term successes. That’s why you’ll need a motivated team that will spend the time necessary to collect relevant data.

You will also have to assess what demand drivers you should use for your model.

📊 External Data. Be aware that external data might be both expensive and inconsistent. For example, many external providers will share market information a few months late and at a granularity that won’t match your requirements. Just avoid it.

🌦️ Weather. Weather impacts many supply chains: you will sell more or less depending on the weather. Unfortunately, you can’t predict the weather accurately more than a few days in advance. I usually take the example of ice creams: sales are highly impacted by sunlight, but you can’t predict the weather four weeks in advance to plan your production.

Data Science

When doing a demand forecasting project, I like to follow the steps highlighted below.

Copyright: Nicolas Vandeput (from my Demand Forecasting training)

As you can see, it is crucial to start with a clear objective (in terms of granularity, horizon, and metrics) before beginning to work on data collection and model creation.

You should validate results against a test set that wasn’t used to train the model. For example, keep a few months of demand aside from the dataset you’ll use to train your model. You can then test it over these unseen periods to assess its accuracy.
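A minimal sketch of this holdout approach is shown below. The function name, the three-month holdout, and the demand figures are illustrative assumptions. The important point is that the split is chronological: the most recent periods are kept aside, never a random sample.

```python
def split_holdout(series, holdout_periods):
    """Split a time series chronologically into train and test sets.

    The last `holdout_periods` observations are kept aside as the test
    set, so the model never sees them during training.
    """
    if not 0 < holdout_periods < len(series):
        raise ValueError("holdout_periods must be between 1 and len(series) - 1")
    return series[:-holdout_periods], series[-holdout_periods:]


# Hypothetical monthly demand for one product, oldest first.
monthly_demand = [95, 102, 98, 110, 105, 120, 115, 108, 112, 125, 130, 118]

train, test = split_holdout(monthly_demand, holdout_periods=3)
print(len(train), "months for training,", len(test), "months kept aside")
```

You would train the model on `train` only, then forecast the periods in `test` and measure the error on them.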

Project Timeline

Copyright: Nicolas Vandeput (from my Demand Forecasting training)

1️⃣ Data Gathering & Cleaning

👩‍💻 Power Users

In this first phase, you will gather and clean historical demand and demand drivers. Keep in mind that getting data for some demand drivers might take months (and call for time-intensive work). In that case, you might want to go straight to step 2 and try another model later with more data.

✔️ Skip this step if you already use relevant data in your current forecasting software

2️⃣ Model Creation

👩‍🔬 Data Scientists

Data scientists will try out different models with different data features until they achieve the desired results.

3️⃣ Model in Production

👩‍💻 Data Engineers

Once you have a working model, you can transfer it from a “manual/local computer” setup to an “automated/cloud” one.

Keep in mind that the time invested in moving a working model from a local machine to the cloud (and automating it) might not be worth it. I’ve seen projects lose three months transferring a working model to the cloud, hoping to save 10-30 minutes of manual work per week.

✔️ Skip this step if you’re ok with running the model manually once a week/month

4️⃣ User Acceptance

👨‍💼 Project Manager

The model accuracy should be tested against future unseen data. This is the only way to assess forecasting quality. Remember to compare the accuracy achieved by your model with the one achieved by a simple benchmark (see article below), your current forecasting engine, as well as your consensus forecast.

Do not hesitate to do a few parallel runs to confirm that the new model works fine.
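The comparison against a simple benchmark can be sketched as follows. Here the benchmark is a three-period moving average and the error measure is a plain MAE. The window length, the demand figures, and the model forecasts are all hypothetical values used only to show the mechanics.

```python
def moving_average_forecast(history, window=3):
    """Benchmark: forecast the next period as the mean of the last `window` periods."""
    if len(history) < window:
        raise ValueError("not enough history for the chosen window")
    return sum(history[-window:]) / window


def mean_absolute_error(forecasts, actuals):
    """Average absolute gap between forecasts and actual demand."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


# Hypothetical demand history; the last three periods are the unseen data.
demand = [100, 110, 90, 105, 95, 120, 100, 115]
first_unseen = 5
actuals = demand[first_unseen:]

# The benchmark only uses data available before each forecasted period.
benchmark_forecasts = [
    moving_average_forecast(demand[:t]) for t in range(first_unseen, len(demand))
]

# Hypothetical forecasts produced by the new model for the same periods.
model_forecasts = [105, 108, 108]

benchmark_mae = mean_absolute_error(benchmark_forecasts, actuals)
model_mae = mean_absolute_error(model_forecasts, actuals)
error_reduction = (benchmark_mae - model_mae) / benchmark_mae

print(f"Benchmark MAE: {benchmark_mae:.1f}")
print(f"Model MAE: {model_mae:.1f}")
print(f"Error reduction vs. benchmark: {error_reduction:.0%}")
```

The same comparison should be repeated against your current forecasting engine and your consensus forecast, over several parallel runs, before drawing conclusions.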

This article is based on one of my previous webinars. Register here to be informed of future ones.

Q&A

How much extra accuracy can we expect from using machine learning to forecast demand?

Usually, machine learning models beat state-of-the-art forecasting software by 5 to 15%. Better accuracy can be achieved as more data is available (demand drivers).

How to Launch a Proof of Concept (POC)

  1. Gather initial data (you can use your current statistical tool data)
  2. Optimize a model
  3. Do a few trial runs
  4. Success? Implement your solution!
