
Machine Learning for Retail Demand Forecasting

Comparative study of Demand Forecasting Methods for a Retail Store (XGBoost Model vs. Rolling Mean)

Photo by NeONBRAND on Unsplash

For most retailers, Demand Planning systems take a fixed, rule-based approach to forecasting and replenishment order management.

(1) Demand Planning Optimization Problem Statement - (Image by Author)

This works well enough for stable and predictable product categories but can show its limits with unstable demand impacted by external factors.

As a data scientist, how can you improve robustness? Provide better forecasts with Machine Learning.

In this article, we will implement a model to forecast the demand for retail stores using machine learning with Python.

This approach uses the M5 Competition Walmart dataset that will be introduced in the first section.

Summary
I. Demand Planning Optimization Problem Statement
  Forecast the demand of 50 retail stores in the US
II. XGBoost for Sales Forecasting
  Build a forecasting model using Machine Learning
III. Demand Planning: XGBoost vs. Rolling Mean
  1. Demand Planning using Rolling Mean
  An initial approach using a simple formula to set the baseline
  2. XGBoost vs. Rolling Mean
  What is the impact of Machine Learning on Accuracy?
  3. Product Segmentation for Retail using Python
  Do you need to apply machine learning on all items?
IV. Next steps
  1. Implement Inventory Management Rules
  Combine your forecasting model with Inventory Rules to reduce stockouts
  2. Simulation Model with ChatGPT - "The Supply Chain Analyst"
  Implement analytics products with UI powered by GPT on ChatGPT
  3. Sustainable Approach: Green Inventory Management
  Reduce the carbon footprint of your supply chain with smart inventory rules
  4. Improve the Machine Learning Model
  Feature engineering can help us gain additional points of accuracy

Demand Planning Optimization Problem Statement

Retail Company with 50 Stores

For this study, we’ll take a dataset from the Kaggle challenge: Store Item Demand Forecasting Challenge.

Scope

  • Transactions from 2013-01-01 to 2017-12-31
  • 913,000 Sales Transactions
  • 10 Stores
  • 1,913 days for the training set and 28 days for the evaluation set

What do they sell?

Exploratory Data Analysis

We want to predict the sales of 3,049 unique items in these ten stores.

M5 Forecasting Competition Dataset - (Image by Author)

They are grouped into three different families with sub-categories.

Which external factors do we have on hand?

As you can guess, we can’t have an exhaustive list of the external factors influencing these sales (no one has it).

However, the dataset includes:

  • Pricing per reference per store for each period
  • Store Location
  • Transaction Date

With these data on hand, let’s see how we can forecast the demand using Machine Learning with Python.


XGBoost for Sales Forecasting

The initial dataset was used for a Kaggle Challenge, where teams competed to design the best model to predict sales.

The first objective here is to design a prediction model using XGBoost.

This model will feed our replenishment strategy, helping to optimize inventory and reduce the number of deliveries from the warehouse.

Can we enrich this dataset? Yes!

Add Date Features

Using the date, we can use the year, month, day of the week, and additional date variations to capture patterns in the demand variability.
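
Since the original code embed is not available here, the snippet below is only a minimal sketch of this step with pandas. The column names (date, store, item, sales) follow the output dataset described later in this article; the toy values and the helper name add_date_features are assumptions made for illustration.

```python
import pandas as pd

# Toy data (made-up values) using the column names of this article
df = pd.DataFrame({
    "date": pd.to_datetime(["2017-12-29", "2017-12-30", "2017-12-31"]),
    "store": [1, 1, 1],
    "item": [1, 1, 1],
    "sales": [20, 25, 31],
})


def add_date_features(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with calendar features derived from `date`."""
    out = frame.copy()
    out["year"] = out["date"].dt.year
    out["month"] = out["date"].dt.month
    out["day"] = out["date"].dt.day
    out["dayofweek"] = out["date"].dt.dayofweek  # Monday=0 ... Sunday=6
    out["dayofyear"] = out["date"].dt.dayofyear
    out["is_weekend"] = out["dayofweek"] >= 5    # Saturday or Sunday
    return out


features = add_date_features(df)
print(features)
```

Which date variations are worth keeping depends on your data; the correlation check later in this section is one way to decide.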

What do we mean by trend?

Daily, Monthly Average for Train

Daily or monthly average sales may impact your future demand.

Therefore, it becomes an interesting additional parameter to add.
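
As a rough sketch of how these averages could be built on the training set, the snippet below uses a pandas groupby. It assumes that "daily average" means the average per day of the week and "monthly average" the average per calendar month, both per store and item; this interpretation and the toy values are assumptions, not the original code.

```python
import pandas as pd

# Toy training data (made-up values)
train = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-02", "2017-01-09", "2017-01-03", "2017-02-06"]),
    "store": [1, 1, 1, 1],
    "item": [1, 1, 1, 1],
    "sales": [10, 20, 30, 40],
})
train["month"] = train["date"].dt.month
train["dayofweek"] = train["date"].dt.dayofweek

keys = ["store", "item"]

# Average sales per store, item and day of the week
train["daily_avg"] = train.groupby(keys + ["dayofweek"])["sales"].transform("mean")

# Average sales per store, item and calendar month
train["monthly_avg"] = train.groupby(keys + ["month"])["sales"].transform("mean")

print(train)
```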

What about rolling averages?

Add Daily and Monthly Averages to Test and Rolling Averages

They are the benchmark of your model and can also improve the accuracy of your machine-learning model.
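
Here is a minimal sketch of a rolling-average feature with pandas. The window length is a hypothetical choice, and the shift by one day is an assumption added so that the feature for a given day only uses sales from previous days.

```python
import pandas as pd

# Toy data (made-up values) for a single store and item
sales = pd.DataFrame({
    "date": pd.date_range("2017-01-01", periods=6, freq="D"),
    "store": 1,
    "item": 1,
    "sales": [10, 12, 14, 16, 18, 20],
}).sort_values(["store", "item", "date"])

window = 3  # hypothetical window length, in days

# shift(1) excludes the current day, so the feature never sees the value to predict
sales["rolling_mean"] = (
    sales.groupby(["store", "item"])["sales"]
    .transform(lambda s: s.shift(1).rolling(window).mean())
)

print(sales)
```

The first rows stay empty until enough history is available, so they need to be dropped or filled before training.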

You should always ask yourself:

Is my model better than a moving average?

If not, you have no reason to continue to invest time in building a model that will require resources to be put into production.

Now that we have enriched our dataset with additional features, let’s check the correlation between these features and the metric we want to forecast.

Heat Map to check correlation

Pearson Correlation Heatmap - (Image by Author)

Let us keep the monthly average since it is the most correlated with sales and remove other highly correlated features.

There is no point in keeping features that are highly correlated with each other, since they bring redundant information to the model.
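
One possible way to automate this selection is sketched below. The synthetic data, the 0.9 threshold and the helper name drop_redundant are assumptions for illustration; the idea is to rank features by their correlation with sales and to keep a feature only if it is not highly correlated with one already kept.

```python
import numpy as np
import pandas as pd

# Synthetic data (illustration only): daily_avg nearly duplicates monthly_avg
rng = np.random.default_rng(0)
n = 200
monthly_avg = rng.normal(50, 10, n)
df = pd.DataFrame({
    "sales": monthly_avg + rng.normal(0, 2, n),
    "monthly_avg": monthly_avg,
    "daily_avg": monthly_avg + rng.normal(0, 0.5, n),
    "dayofweek": rng.integers(0, 7, n),
})

corr = df.corr(method="pearson")


def drop_redundant(corr: pd.DataFrame, target: str, threshold: float = 0.9) -> list:
    """Greedy filter: visit features from most to least correlated with
    `target` and keep one only if its absolute correlation with every
    feature already kept is below `threshold`."""
    ranked = corr[target].drop(target).abs().sort_values(ascending=False).index
    kept = []
    for feat in ranked:
        if all(abs(corr.loc[feat, k]) < threshold for k in kept):
            kept.append(feat)
    return kept


selected = drop_redundant(corr, "sales")
print(corr.round(2))
print(selected)
```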

Clean features, Training/Test Split and Run model

Now that we have trained the model, let’s have a look at the results.

Results Prediction Model

Prediction vs Actual Sales - (Image by Author)

Based on this prediction model, we’ll build a simulation model to improve demand planning for store replenishment.

When should we trigger a replenishment?

Replenishment is delivering additional goods to a store to ensure we have the minimum inventory needed to meet customers’ demands.

As an output, we have a dataset with the following features:

  • date: Transaction date
  • item: SKU Number
  • store: Store Number
  • sales: Actual value of sales transaction
  • sales_prd: XGBoost prediction
  • error_forecast: sales_prd - sales
  • reply: boolean value for replenishment days (True if the day is in ['Monday', 'Wednesday', 'Friday', 'Sunday'])
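
The two derived columns of this list can be computed as in the following sketch, which assumes a pandas DataFrame that already holds the actual sales and the XGBoost predictions (toy values here).

```python
import pandas as pd

# Toy output dataset (made-up values)
out = pd.DataFrame({
    "date": pd.to_datetime(["2017-12-25", "2017-12-26", "2017-12-27"]),
    "item": [1, 1, 1],
    "store": [1, 1, 1],
    "sales": [20, 18, 25],
    "sales_prd": [22.0, 17.0, 25.0],
})

replenishment_days = ["Monday", "Wednesday", "Friday", "Sunday"]

# Forecast error: prediction minus actual sales
out["error_forecast"] = out["sales_prd"] - out["sales"]

# Flag the days on which the store can be replenished
out["reply"] = out["date"].dt.day_name().isin(replenishment_days)

print(out)
```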

Does our model provide added value vs. a benchmark?

This is what we’ll try to figure out in the next section.


Demand Forecasting: XGBoost vs. Rolling Mean

Demand Forecasting using Rolling Mean

Your benchmark method to forecast demand is the rolling mean of previous sales.

Easy to design, deploy and maintain.

At the end of Day n-1, you need to forecast demand for Day n, Day n+1, Day n+2.

  1. Calculate the average sales quantity of the last p days: Rolling Mean (Day n-1, …, Day n-p)
  2. Apply this mean to the sales forecast of Day n, Day n+1, Day n+2
  3. Forecast Demand = Forecast_Day(n) + Forecast_Day(n+1) + Forecast_Day(n+2)
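
The three steps above can be sketched in plain Python as follows. The function name rolling_mean_demand and the sales values are made up for illustration.

```python
from statistics import mean


def rolling_mean_demand(past_sales, p, horizon=3):
    """Forecast the total demand of the next `horizon` days.

    past_sales: daily sales ordered from oldest to newest, ending at Day n-1
    p: number of past days used in the rolling mean
    """
    if p < 1 or len(past_sales) < p:
        raise ValueError("need at least p past observations")
    daily_forecast = mean(past_sales[-p:])  # step 1: mean of the last p days
    return horizon * daily_forecast         # steps 2 and 3: same value for each day, summed


past_sales = [12, 15, 9, 14, 10, 11, 13, 12]  # Day n-8 ... Day n-1 (made-up values)
demand_rm = rolling_mean_demand(past_sales, p=8)
print(demand_rm)  # 36
```
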
Demand Forecast Using Rolling Mean - (Image by Author)

Is our model more accurate than the benchmark?

XGBoost vs. Rolling Mean

With our XGBoost model, we now have two methods for demand forecasting.

Let us try to compare the results of these two methods on forecast accuracy:

A methodology using XGBoost and Rolling Mean - (Image by Author)
  1. Prepare Replenishment on Day n-1
     We need to forecast the replenishment quantity for Day n, Day n+1, Day n+2

  2. XGB prediction gives us a demand forecast
     Demand_XGB = Forecast_Day(n) + Forecast_Day(n+1) + Forecast_Day(n+2)

  3. The Rolling Mean Method gives us a demand forecast
     Demand_RM = 3 x Rolling_Mean(Day(n-1), Day(n-2), …, Day(n-p))

  4. Actual Demand
     Demand_Actual = Actual_Day(n) + Actual_Day(n+1) + Actual_Day(n+2)

  5. Forecast Error
     Error_RM = (Demand_RM - Demand_Actual)
     Error_XGB = (Demand_XGB - Demand_Actual)
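
For a single replenishment day, this comparison could look like the sketch below. The function name demand_errors and all the numbers are made up for illustration.

```python
def demand_errors(forecast_xgb, past_sales, actual, p):
    """Compare both methods over one replenishment horizon.

    forecast_xgb: XGBoost daily forecasts for Day n, n+1, n+2
    past_sales:   daily sales ordered oldest to newest, ending at Day n-1
    actual:       actual sales for Day n, n+1, n+2
    p:            number of days in the rolling mean
    """
    horizon = len(actual)
    demand_xgb = sum(forecast_xgb)
    demand_rm = horizon * sum(past_sales[-p:]) / p
    demand_actual = sum(actual)
    return {
        "Error_XGB": demand_xgb - demand_actual,
        "Error_RM": demand_rm - demand_actual,
    }


errors = demand_errors(
    forecast_xgb=[13, 12, 15],
    past_sales=[12, 15, 9, 14, 10, 11, 13, 12],
    actual=[14, 11, 16],
    p=8,
)
print(errors)  # {'Error_XGB': -1, 'Error_RM': -5.0}
```

In practice, you would repeat this for every replenishment day, store and item, and aggregate the absolute errors before comparing the two methods.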

What is the optimal number of days for the rolling average?

Parameter tuning: Rolling Mean for p days

Before comparing the Rolling Mean results with XGBoost, let us find the value of p that gives the best performance.
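
A simple way to run this search is sketched below. It assumes the mean absolute error on the 3-day demand as the metric, which may differ from the exact metric behind the chart, and every candidate p is scored on the same days so that the comparison stays fair. The function name best_window and the toy series are made up for illustration.

```python
def best_window(sales, candidates, horizon=3):
    """Return the p with the lowest mean absolute error, plus all scores.

    Every candidate is evaluated on the same days (starting at the largest
    window) so that scores are comparable.
    """
    start = max(candidates)
    last = len(sales) - horizon  # last day n with a full actual horizon
    scores = {}
    for p in candidates:
        abs_errors = [
            abs(horizon * sum(sales[n - p:n]) / p - sum(sales[n:n + horizon]))
            for n in range(start, last + 1)
        ]
        scores[p] = sum(abs_errors) / len(abs_errors)
    return min(scores, key=scores.get), scores


# Toy series alternating between low and high days (made-up values)
sales = [10, 20] * 10
best_p, scores = best_window(sales, candidates=[1, 2])
print(best_p, scores)  # 2 {1: 20.0, 2: 5.0}
```

On this toy series, averaging two days smooths out the alternation, which is why p = 2 beats p = 1.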

Minimum Error with Rolling Mean - (Image by Author)

Results: 35% lower forecast error with p = 8 than with p = 1

Thus, based on the sales transactions profile, we can get the best demand planning performance by forecasting the next day’s sales using the average of the last 8 days.

Let’s compare it with XGBoost.

XGBoost vs. Rolling Mean: p = 8 days

Error XGBoost vs. Rolling Mean - (Image by Author)

Results: -32% error in the forecast by using XGBoost vs. Rolling Mean

Forecast error by (axis-x: Store Number, axis-y: Item Number, axis-z: Error) - (Image by Author)

The results are convincing.

However, we should remember that training and maintaining a model with so many references to forecast can be challenging.

Do we have to use Machine Learning to forecast all references?

Product Segmentation for Retail

The answer is no.

With product segmentation, you can group products considering their contribution to the turnover and their demand variability.

ABC Analysis Principle - (Image by Author)

In short, you want to focus on the products with the most turnover and unstable demand.

What about the other products?

These high-importance SKUs are the ones you’ll target with the machine learning model, while the others can be forecast with fixed rules or simple statistical models.
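
As a rough illustration of this selection, the sketch below ranks items by their contribution and measures demand variability with the coefficient of variation (standard deviation divided by mean). Units sold are used as a proxy for turnover, and the 80% and 0.5 thresholds are hypothetical; the linked article describes the full method.

```python
import pandas as pd

# Toy daily sales for three items (made-up values)
sales = pd.DataFrame({
    "item": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "sales": [100, 20, 180, 60, 50, 52, 48, 50, 5, 5, 5, 5],
})

stats = sales.groupby("item")["sales"].agg(total="sum", avg="mean", sd="std")
stats["cv"] = stats["sd"] / stats["avg"]  # demand variability

# Rank items by contribution and compute the cumulative share of the total
stats = stats.sort_values("total", ascending=False)
stats["share"] = stats["total"] / stats["total"].sum()
stats["cum_share"] = stats["share"].cumsum()

# High contribution: the item starts inside the first 80% of the total (hypothetical cut-off)
stats["high_turnover"] = (stats["cum_share"] - stats["share"]) < 0.80

# Unstable demand: coefficient of variation above 0.5 (hypothetical cut-off)
stats["unstable"] = stats["cv"] > 0.5

# Candidates for the machine learning model
stats["use_ml"] = stats["high_turnover"] & stats["unstable"]
print(stats)
```

Here only item A combines a high contribution with unstable demand; B sells well but is stable, and C barely contributes.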

To learn more about product segmentation, read my article linked below.

Product Segmentation for Retail with Python


Conclusion

With our baseline model, the rolling mean, finding the best parameter p days could reduce forecast error by 35%.

However, we could perform even better using the XGBoost forecast to predict demand for days n, n+1, and n+2, which reduced the forecast error by a further 32% compared with the rolling mean.

Does that mean we have to switch to Machine Learning?

The answer is "not always".

  • Not for all references based on your product segmentation
  • Not if you don’t have external factors in your dataset

What are the next steps? Implementing inventory management rules.

Implement Inventory Management Rules

In many traditional retailers, inventory management systems take a fixed, rule-based approach to replenishment order management.

We need replenishment policies that minimize ordering, holding and shortage costs.

Combined with our forecasting model, these policies can help you optimize your inventory, reduce stock-outs, and avoid overstock.

When do you need to replenish your store?

With which quantity?

Example of Inventory Management Rule (Customer Demand / Ordering Quantity/ Inventory On Hand)

In the chart above, you can see

  • The store demand for a specific SKU (RED)
  • The replenishment quantities (BLUE)
  • The inventory On Hand (GREEN)

These policies are based on the Economic Order Quantity and consider the variability of your demand.

Order-Up-To-Level Formula
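
The formula image is not reproduced here, so the sketch below uses a common textbook form of the order-up-to level for a periodic review policy, not necessarily the exact formula of the figure. It assumes independent, normally distributed daily demand; the function names and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def order_up_to_level(mu_daily, sigma_daily, review_days, lead_time_days, service_level):
    """Order-up-to level S = expected demand over the protection period
    (review period + lead time) plus a safety stock."""
    z = NormalDist().inv_cdf(service_level)   # safety factor for the target service level
    horizon = review_days + lead_time_days    # protection period, in days
    return mu_daily * horizon + z * sigma_daily * sqrt(horizon)


def order_quantity(level, inventory_on_hand):
    """At each review, order what is missing to get back to the level S."""
    return max(0.0, level - inventory_on_hand)


# Hypothetical inputs: 12 units/day on average, standard deviation of 3 units
S = order_up_to_level(mu_daily=12, sigma_daily=3, review_days=2,
                      lead_time_days=2, service_level=0.95)
print(round(S, 2), round(order_quantity(S, inventory_on_hand=20), 2))
```

The daily mean can come from your forecasting model, which is how the forecast and the inventory rule fit together.
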

I invite you to look at this series of articles for more information.

  • Start with a simple deterministic model to get familiar with the EOQ

Inventory Management for Retail – Deterministic Demand

  • Move to more complex models adapted to stochastic demand distributions

Inventory Management for Retail – Stochastic Demand

Inventory Management for Retail – Periodic Review Policy

Have you heard about Generative AI?

Simulation Model with ChatGPT – "The Supply Chain Analyst"

Large Language Models like GPT can support the analysis of your inventory management rules and interact with users.

What is the optimal rule to minimize my ordering costs?

In the article linked below, I introduce a custom GPT designed to automate Supply Chain analytics tasks.

"The Supply Chain Analyst" - (Image by Author)

This initial prototype is a proof of concept.

OpenAI’s GPTs can be used with advanced analytics models in Python scripts to create interactive analytics products.

"The Supply Chain Analyst": Inventory Management Module - (Image by Author)

We can imagine users uploading their sales and interacting with the agent to understand how to set an optimal rule.

For more details,

Create GPTs to Automate Supply Chain Analytics

Leveraging LLMs with LangChain for Supply Chain Analytics – A Control Tower Powered by GPT

What about sustainability?

My approach always focuses on maximizing accuracy to optimize your store’s inventory.

The goal is to minimize the costs of ordering, delivering and storing items while meeting customer demand.

This can be coupled with sustainable practices to minimize the environmental footprint of your distribution network.

What would be the impact on CO2e emissions if we reduce the frequency of store replenishments?

Sustainable Approach: Green Inventory Management

This can be defined as managing inventory in an environmentally sustainable way.

(Image by Author)

This involves processes and rules reducing the environmental impact of order preparation and delivery.

Low vs. High Frequency Replenishments

In the example above, we see two different store replenishment approaches.

  • On the left, you replenish less frequently with high quantity delivered.
  • On the right, you replenish more frequently with a low delivery quantity.

Do these two approaches have the same impact on the CO2 emissions of your distribution network?

Of course, they don’t.

In this case study, we discover how to simulate the variation of store replenishment frequency and measure its impact on the overall environmental impact.

Data Science for Sustainability - Green Inventory Management

Can we improve the model with better feature engineering?

Improve the model

Yes! I have been working on an improved version of the model, and I share my insights in the article below (with the complete code).

The goal is to understand the impact of adding business features (e.g., price change, sales trend, store closing) on the model’s accuracy.

Machine Learning for Retail Sales Forecasting – Features Engineering


About Me

Let’s connect on LinkedIn and Twitter. I am a Supply Chain Engineer using data analytics to improve logistics operations and reduce costs.

For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

If you are interested in Data Analytics and Supply Chain, look at my website.

Samir Saci | Data Science & Productivity


References

[1] Kaggle Dataset, Store Item Demand Forecasting Challenge, Link

