
Product segmentation refers to the activity of grouping products that have similar characteristics and serve a similar market.
In Logistics, attention is mainly focused on sales volume distribution, demand variability and delivery lead time.
As a Data Scientist for a Retail company, how can you automate this analysis with Python?
You want to put efforts into managing products that have:
- The highest contribution to your total turnover: ABC Analysis
- The most unstable demand: Demand Variability
In this article, we will introduce simple statistical tools to combine ABC Analysis and Demand Variability with Python.
SUMMARY
I. Scenario
1. Problem Statement
2. Scope Analysis
3. Objective
II. Segmentation
ABC Analysis
Demand Stability: Coefficient of Variation
Normality Test
III. Conclusion
1. Automate the process with Business Intelligence
2. Inventory Management Rules
How to do product segmentation with Python?
Product Segmentation for Retail
You support the Operational Director of a local Distribution Center (DC) that delivers 10 Hypermarkets.
In her scope, there is the responsibility of
- Preparation and delivery of replenishment orders from stores
- Demand Planning and Inventory Management
What are the operational challenges?
Logistics Operations for Retail
This analysis will be based on the M5 Forecasting dataset of Walmart stores’ sales records.
We suppose that we only have the first-year data (d_1 to d_365):
- 10 stores in 3 states (USA)
- 1,878 unique SKU
- 3 categories and 7 departments (sub-category)
Categories and departments do not impact your ordering, picking, or shipping processes except for the warehouse layout.
Code – Data Processing
What does impact your logistic performance?
Products RotationWhat are the references that are driving most of your sales?
- Very Fast Movers: top 5% (Class A)
- The following 15% of fast movers (Class B)
- The remaining 80% of prolonged movers (Class C)
This classification will impact,
- Warehouse Layout: Reduce Warehouse Space with the Pareto Principle using Python
- Picking Process: Improve Warehouse Productivity using Order Batching with Python
Demand VariabilityHow stable is your customers’ demand?
- Average Sales: µ
- Standard Deviation:
- Coefficient of Variation: CV = σ/µ
You may need more stable customer demand for SKUs with a high CV value, leading to workload peaks, forecasting complexity and stock-outs.
Code
- Filter on the first year of sales for HOBBIES Skus
- Calculate Mean, Standard deviation and CV of sales
-
Sorting (Descending) and Cumulative sales calculation for ABC analysis
Now that you have computed the key indicators, let’s generate visualizations.
Methodologies of Product Segmentation
This analysis will be done for the SKU in the HOBBIES category.
How to perform ABC Analysis?
What are the references that are driving most of your sales?

Class A: the top 5%
- Number of SKU: 16
- Turnover (%): 25%
Class B: the following 15%
- Number of SKU: 48
- Turnover (%): 31%
Class C: the 80% slow movers
- Number of SKU: 253
- Turnover (%): 43%
In this example, the Pareto Law (20% of SKUs making 80% of the turnover) is not observed.
However, 80% of our portfolio still makes less than 50% of the sales.
Code
How stable is your customers’ demand?
Define Demand Stability with the Coefficient of Variation
From the Logistics Manager’s point of view, handling a peak of sales is much more challenging than ensuring uniform distribution throughout the year.
To understand which products will present planning and distribution challenges, we will compute the coefficient of variation of each reference’s yearly sales distribution.

Class A
Fortunately, most of the A SKU have a quite stable demand; we won't be challenged by the most important SKUs.

Class B
The majority of SKUs are in the stable area; however we still spend effort on ensuring optimal planning for the few references that have a high CV.

Class C
Most of the SKUs have a high value of CV;
For this kind of reference a cause analysis would provide better results than a statistical approach for forecasting.

Code
Can we assume that the sales follow a normal distribution?
Use the Normality Test to check if a distribution is normal
Most of the simple inventory management methods are based on the assumption that the demand follows a normal distribution.
Why?Because it’s easy.
Sanity CheckVerifying that this hypothesis cannot be refuted before implementing rules and performing forecasts is better.
We’ll use the Shapiro-Wilk test for normality, which can be implemented using the Scipy library.
The null hypothesis will be (H0: the demand sales follow a normal distribution).

Bad News
For an alpha = 0.05, we can reject the null hypothesis for most of the SKUs. This will impact the complexity of inventory management assumptions.
Code
Do you want to try this algorithm without coding?
I have deployed this solution on a web application that helps you to perform automated Pareto Analysis and ABC Classification after uploading your sales data.

Try it,
![Try the App [Link]](https://towardsdatascience.com/wp-content/uploads/2023/06/1hh2WAM-xs3eAjwPrnebatw.png)
You can find the complete code in this GitHub repository, 👇
GitHub – samirsaci/product-segmentation: Product Segmentation for Retail with Python
Conclusion
These analyses give us insights into operational decision-making.
Which references should we prioritize?
As a data scientist, you can help store managers and warehouse and transportation teams streamline their operations by focusing on revenue-generating products.
How can we automate the process?
Automate the process with Business Intelligence.
The whole process can be automated using business intelligence solutions that extract, process and load data for live reporting.

The idea is to track the evolution of your items and adapt your
- Demand planning strategies
- Warehouse layouts
- Inventory management rules
The challenges are data quality and systems integrations.
For more details about business intelligence solutions,
Now that we have gained valuable insights into product segmentation, the next logical step is to leverage these insights for operational improvements.
But what can we do with these findings?
Inventory Management Rules
A critical application is optimizing inventory management.
For most retailers, inventory management systems take a fixed, rule-based approach to forecast and replenishment order management.
How can we use ABC Analysis to manage inventory?
In another article, we explore how to use Python to design inventory management rules based on demand variability.

Combining the descriptive analysis from product segmentation with advanced inventory strategies.
Objectives: minimize the inventory and avoid stock-outs.
You can ensure that your stock levels are aligned with customer demand, reducing waste and improving profitability.
For more details,
About Me
Let’s connect on Linkedin and Twitter. I am a Supply Chain Engineer who uses data analytics to improve logistics operations and reduce costs.
For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.
If you are interested in Data Analytics and Supply Chain, look at my website.
💌 New articles straight in your inbox for free: Newsletter 📘 Your complete guide for Supply Chain Analytics: Analytics Cheat Sheet