Improve Warehouse Productivity using Spatial Clustering with Python

Improve Warehouse Picking Productivity by Grouping Orders in Batches using Picking Location Spatial Clusters.

Samir Saci

Published in

Towards Data Science

8 min read · Aug 11, 2020

Improve Warehouse Productivity using Spatial Clustering with Python Scipy — (Image by Author)

This article is part of a series about Warehouse Operations Optimization with Python. (Part 1)

In the first article, we built the basis to estimate the total picking route walking distance for a set of orders using:

Warehouse Mapping: Link each order line with the associated picking location coordinate (x, y) in your warehouse
Distance Calculation: a function calculating the walking distance between two picking locations

(8) Results for 5,000 order lines with a ratio from 1 to 9 orders per route — (Image by Author)

As you can see in the chart above, grouping orders in waves has a significant impact on the total walking distance.

Up to 50% reduction after grouping three orders per wave
We reach a 75% reduction if we achieve 9 orders per wave

For the next steps, we decided to take a simple approach for:

Picking Route Design: Given a choice of several picking locations, the warehouse picker will always choose to go to the closest (Next Closest Location Strategy)
Order Waving: Orders are ordered and grouped in waves by receiving time from OMS (TimeStamp)
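These two baseline rules can be sketched in a few lines. This is a minimal illustration, not the code from the first article: the column names (`OrderNumber`, `TimeStamp`), the helper names and the Manhattan placeholder distance are all assumptions made for the example.

```python
import pandas as pd


def manhattan(a, b):
    """Placeholder distance between two (x, y) points (assumption for this sketch)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def next_closest_route(start, locations, dist=manhattan):
    """Visit every location by always walking to the closest remaining one."""
    route, current, remaining = [], start, list(locations)
    while remaining:
        nearest = min(remaining, key=lambda loc: dist(current, loc))
        remaining.remove(nearest)
        route.append(nearest)
        current = nearest
    return route


def waves_by_timestamp(df_orders, orders_number):
    """Sort orders by receiving time and number waves of `orders_number` orders."""
    df = df_orders.sort_values("TimeStamp").reset_index(drop=True)
    df["WaveID"] = df.index // orders_number
    return df


# Hypothetical sample data
orders = pd.DataFrame({
    "OrderNumber": [101, 102, 103, 104, 105],
    "TimeStamp": pd.to_datetime([
        "2020-08-11 09:05", "2020-08-11 09:01", "2020-08-11 09:10",
        "2020-08-11 09:03", "2020-08-11 09:07",
    ]),
})
waves = waves_by_timestamp(orders, orders_number=2)
route = next_closest_route((0, 0), [(5, 5), (1, 0), (1, 3)])
```

Any distance function with the same two-argument signature can be passed as `dist`, including a walking distance that accounts for the aisle layout.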

Two levers for improving our solution performance — (Image by Author)

In this article, we will take a deep dive into the Order Wave Processing solution, focusing on using spatial clustering to group orders.

I. Order Wave using Picking Locations Clustering
Group orders by geographical clusters of the picking locations
1. Picking locations clustering using Scipy
Apply clustering techniques on the picking locations to create groups
2. Picking locations clustering for Multi-line Orders
Centers of gravities can be used if we have multiple locations in an order
II. Model Simulation
1. Comparing 3 methods of Wave Processing
2. Tuning Distance Threshold for Clustering
III. Conclusion
1. Advanced diagnostic using process mining
2. Warehouse Layout Optimization using Pareto Analysis
3. Next Steps

You can find the full code in my GitHub repository: Link

I. Order Wave using Picking Locations Clustering

Single-line orders have the advantage of being located in a single storage location; grouping several single-line orders by cluster can ensure that our picker will stay in a delimited zone.

Where are single-line orders located?

(2) Order Lines DataFrame — (Image by Author)

Function: Calculating the Number of single-line orders per storage Location (%)

Code
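As a rough illustration of this calculation (not the author's original function), the share of single-line orders per aisle could be computed with pandas as follows. The column names `OrderNumber`, `SKU` and `Aisle` and the sample data are assumptions.

```python
import pandas as pd


def single_line_share_per_aisle(df_orderlines):
    """Return the share (%) of single-line orders found in each aisle."""
    # Number of lines in the order each row belongs to
    lines_per_order = df_orderlines.groupby("OrderNumber")["SKU"].transform("count")
    df_single = df_orderlines[lines_per_order == 1]
    share = df_single["Aisle"].value_counts(normalize=True) * 100
    return share.round(1)


# Hypothetical sample: order 2 has two lines, all other orders have one
df_orderlines = pd.DataFrame({
    "OrderNumber": [1, 2, 2, 3, 4, 5],
    "SKU": ["a", "b", "c", "d", "e", "f"],
    "Aisle": ["A11", "A10", "A03", "A11", "A10", "A11"],
})
share = single_line_share_per_aisle(df_orderlines)
```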

(1) Distribution of single-line orders lines per storage location — 5,000 order lines (%)

Insights: Consider the distribution above

Scope: 5,000 order lines for 23 aisles
Single-line orders: 49% of orders are located in aisles A11, A10, and A09

1. Picking locations clustering using Scipy

(3) Order Lines Processing for Order Wave Picking using Clustering by Picking Location — (Image by Author)

Idea: Picking Locations Clusters
Group picking locations by clusters to reduce the walking distance for each picking route. (Example: the maximum walking distance between two locations is <15 m)

Spatial clustering is the task of grouping a set of points so that objects in the same cluster are more similar to each other than to objects in other clusters.

(4) Example of three Picking Locations Clusters — (Image by Author)

Here, the similarity metric will be walking distance from one location to another.

For instance, I would like to group locations, ensuring the maximum walking distance between two locations is 10 m.

1| Challenge 1: Euclidean Distance vs. Walking Distance

We cannot use conventional clustering methods based on Euclidean distance for our specific model. Indeed, walking distance (computed with the distance_picking function) differs from Euclidean distance.

(5) Euclidean vs. Custom Distance Example — (Image by Author)

For this specific example, the Euclidean distances from i (x_i, y_i) to the two points p (x_p, y_p) and j (x_j, y_j) are equal. But if we compare the picker's walking distance, p (x_p, y_p) is closer.

Picker's Walking Distance is the specific metric we want to reduce for this model.

Therefore, the clustering algorithm should use our custom-made distance_picking function for better performance.
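To make the difference concrete, here is a minimal sketch of a walking distance for a layout of parallel aisles. It is an assumption about how such a function could look, not the function from the first article: aisles are assumed to run along the y axis and to be connected only by two cross-aisles at `y_low` and `y_high`. The function name and the coordinates are hypothetical.

```python
import math


def euclidean_distance(loc1, loc2):
    """Straight-line distance between two (x, y) locations."""
    return math.hypot(loc2[0] - loc1[0], loc2[1] - loc1[1])


def walking_distance(loc1, loc2, y_low, y_high):
    """Walking distance between two (x, y) locations.

    Assumption: a picker can only change aisle (x) through the
    cross-aisles located at y_low and y_high.
    """
    (x1, y1), (x2, y2) = loc1, loc2
    if x1 == x2:
        # Same aisle: walk straight along the aisle
        return abs(y2 - y1)
    via_top = (y_high - y1) + (y_high - y2)
    via_bottom = (y1 - y_low) + (y2 - y_low)
    return abs(x2 - x1) + min(via_top, via_bottom)


# i and p share an aisle; j is in another aisle at the same height as i
i, p, j = (0, 5), (0, 10), (5, 5)
```

With cross-aisles at y = 0 and y = 20, both p and j are 5 m from i in a straight line, but reaching j means leaving the aisle through a cross-aisle, so its walking distance is three times longer.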

Example: Locations Clustering within a 25 m distance (5,000 order lines)

(6) Left [Clustering using Walking Distance] / Right [Clustering using Euclidean Distance] — (Image by Author)

The left example, using walking distance, groups locations within the same aisle, which reduces the picking route distance.

The right example, using Euclidean distance, can group locations that span several aisles.

2 | Function: Clusters for Single Line Orders using Walking Distance

For a set of order lines (df_orderlines), extract the single-line orders and create clusters of storage locations within a maximum distance (distance_threshold) using the custom distance function (dist_method).

The Python code below uses Scipy’s ward and fcluster functions to cluster picking locations using the distance_func metric (walking distance).

Code
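A minimal sketch of this step, under stated assumptions, could look like the block below. It relies on three SciPy functions: `pdist` (which accepts a Python callable as metric), `ward` and `fcluster`. The walking distance helper, its default cross-aisle positions and the sample coordinates are hypothetical and stand in for the article's own distance function.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, ward
from scipy.spatial.distance import pdist


def walking_distance(u, v, y_low=0.0, y_high=20.0):
    """Hypothetical walking distance: aisles run along y, cross-aisles at y_low/y_high."""
    x1, y1 = u
    x2, y2 = v
    if x1 == x2:
        return abs(y2 - y1)
    via_top = (y_high - y1) + (y_high - y2)
    via_bottom = (y1 - y_low) + (y2 - y_low)
    return abs(x2 - x1) + min(via_top, via_bottom)


def cluster_locations(coords, distance_threshold, dist_func=walking_distance):
    """Return one cluster label per location, using Ward linkage on a custom metric."""
    # Condensed pairwise distance matrix computed with the custom metric
    condensed = pdist(np.asarray(coords, dtype=float), metric=dist_func)
    linkage_matrix = ward(condensed)
    # Cut the dendrogram where the merge height exceeds the threshold
    return fcluster(linkage_matrix, t=distance_threshold, criterion="distance")


# Three locations in the aisle at x = 0, two in the aisle at x = 10
coords = [(0, 1), (0, 3), (0, 5), (10, 1), (10, 3)]
labels = cluster_locations(coords, distance_threshold=5)
```

Ward linkage is formally defined for Euclidean distances, so with a custom metric the merge heights are best read as a heuristic. The threshold is applied to those merge heights, not strictly to the largest pairwise distance inside a cluster.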

3| Function: Single Line Orders Mapping with ClusterID

For a set of order lines (df), extract the single-line orders along with their cluster ID and order number.

In this function, you map your DataFrame with the cluster ID for wave creation.

Code
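The mapping itself can be sketched as a lookup from picking location to cluster label. The column names (`OrderNumber`, `Coord`, `ClusterID`), the function name and the dictionary are assumptions for illustration. The dictionary would typically be built from the labels returned by the clustering step.

```python
import pandas as pd


def map_cluster_ids(df_single, location_to_cluster):
    """Attach a ClusterID to each single-line order based on its picking location."""
    df = df_single.copy()
    # dict.get is used as a callable so that tuple coordinates are looked up as-is
    df["ClusterID"] = df["Coord"].map(location_to_cluster.get)
    return df[["OrderNumber", "Coord", "ClusterID"]]


# Hypothetical single-line orders and cluster labels per location
df_single = pd.DataFrame({
    "OrderNumber": [1, 3, 4],
    "Coord": [(0, 1), (10, 3), (0, 5)],
})
location_to_cluster = {(0, 1): 1, (0, 5): 1, (10, 3): 2}
df_mapped = map_cluster_ids(df_single, location_to_cluster)
```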

2. Picking locations clustering for Multi-line Orders

Unlike single-line orders, multi-line orders can cover several picking locations.

1 | Function: Centroid for every multi-line order

However, we can apply the same methodology by replacing each order with the centroid of its storage locations.

Example: Order with 3 lines covering 3 different picking locations

(7) Centroid of three Picking Locations — (Image by Author)

Code
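As a sketch of the idea (not the author's original function), the centroid of each order is the mean of the coordinates of its picking locations. The column names `OrderNumber`, `x` and `y` and the sample values are assumptions.

```python
import pandas as pd


def order_centroids(df_multi):
    """Return one (x, y) centroid per order from its picking location coordinates."""
    return (
        df_multi.groupby("OrderNumber")[["x", "y"]]
        .mean()
        .rename(columns={"x": "x_centroid", "y": "y_centroid"})
        .reset_index()
    )


# Hypothetical orders: order 7 has three lines, order 8 has two
df_multi = pd.DataFrame({
    "OrderNumber": [7, 7, 7, 8, 8],
    "x": [0.0, 3.0, 6.0, 10.0, 10.0],
    "y": [2.0, 4.0, 9.0, 1.0, 5.0],
})
centroids = order_centroids(df_multi)
```

A centroid is not necessarily a real picking location. It only serves as a single representative point for the order during clustering.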

After using this function, we are back to the single-line orders situation, with a single point (x, y) per order.

We can then apply clustering to these points, trying to group orders per geographical zone with maximum distance conditions.

Before looking at complex algorithms, we can find insights on optimising our algorithm with simple solutions.

To help operators find their way, your operations can use voice picking.

II. Model Simulation

To sum up our model construction, see the chart below.

(8) Model Construction with Parameters — (Image by Author)

We have several steps before Picking Routes Creation using Wave Processing.

At each step, we have a collection of parameters that can be tuned to improve performance.

1. Comparing 3 methods of Wave Processing

(9) Three Methods for Wave Processing — (Image by Author)

We’ll first assess the impact of Order Wave processing by clusters of picking locations on total walking distance.

We’ll be testing three different methods.

Method 1: we do not apply clustering (i.e., the initial scenario)
Method 2: we apply clustering to single-line orders only
Method 3: we apply clustering to single-line orders and to the centroids of multi-line orders
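One possible way to turn cluster labels into waves is sketched below. It is an assumption, not necessarily how the simulation builds its waves: orders are sorted by cluster and then by receiving time, and a wave never mixes two clusters. The column names and the sample data are hypothetical.

```python
import pandas as pd


def waves_by_cluster(df_orders, orders_number):
    """Group orders in waves of at most `orders_number` orders within each cluster."""
    df = df_orders.sort_values(["ClusterID", "TimeStamp"]).reset_index(drop=True)
    # Position of each order inside its cluster, then wave index inside the cluster
    rank_in_cluster = df.groupby("ClusterID").cumcount()
    df["WaveInCluster"] = rank_in_cluster // orders_number
    # Global wave number across all clusters
    df["WaveID"] = df.groupby(["ClusterID", "WaveInCluster"]).ngroup()
    return df.drop(columns="WaveInCluster")


# Hypothetical orders already mapped to a cluster
orders = pd.DataFrame({
    "OrderNumber": [1, 2, 3, 4, 5],
    "ClusterID": [1, 2, 1, 1, 2],
    "TimeStamp": pd.to_datetime([
        "2020-08-11 09:00", "2020-08-11 09:01", "2020-08-11 09:02",
        "2020-08-11 09:03", "2020-08-11 09:04",
    ]),
})
waves = waves_by_cluster(orders, orders_number=2)
```

With this rule, small clusters produce incomplete waves, which is one reason the clustering distance matters for the final number of waves.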

Scenario for Simulation

Order lines: 20,000 Lines
Distance Threshold: Maximum distance between two picking locations (distance_threshold = 35 m)
Orders per Wave: orders_number in [1, 9]

(10) Test 1: 20,000 Order Lines / 35 m distance Threshold — (Image by Author)

Results

Best Performance: Method 3 for 9 orders/Wave with 83% reduction of walking distance
Method 2 vs. Method 1: Clustering for single-line orders reduces the walking distance by 34%
Method 3 vs. Method 2: Adding clustering on the centroids of multi-line orders reduces the walking distance by 10%

2. Tuning Distance Threshold for Clustering

We validated our first assumption that Method 3 is the best for our particular scenario (20,000 order lines, 35 m Distance Threshold).

Let us look at the Distance Threshold impact on total walking distance.

(10) Different distance threshold for Picking Location Clustering — (Image by Author)

There is a trade-off between the walking distance between two locations and the wave size:

Low Distance: The walking distance between two locations is low, but you have fewer orders per wave (more waves)
High Distance: The walking distance between two locations is higher, but you have more orders per wave (fewer waves)
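The effect of the threshold on the number of clusters can be explored with a short sweep. The sketch below uses randomly generated locations and a hypothetical walking distance (aisles along y, cross-aisles at y = 0 and y = 20). It only illustrates the mechanism and does not reproduce the results in the charts.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, ward
from scipy.spatial.distance import pdist


def walking_distance(u, v, y_low=0.0, y_high=20.0):
    """Hypothetical walking distance: aisles run along y, cross-aisles at y_low/y_high."""
    x1, y1 = u
    x2, y2 = v
    if x1 == x2:
        return abs(y2 - y1)
    via_top = (y_high - y1) + (y_high - y2)
    via_bottom = (y1 - y_low) + (y2 - y_low)
    return abs(x2 - x1) + min(via_top, via_bottom)


# 60 random locations spread over 10 aisles spaced 5 m apart (assumed layout)
rng = np.random.default_rng(0)
coords = np.column_stack([
    rng.integers(0, 10, size=60) * 5.0,
    rng.uniform(0.0, 20.0, size=60),
])

# The linkage is computed once; only the cut height changes
linkage_matrix = ward(pdist(coords, metric=walking_distance))
clusters_per_threshold = {
    t: int(fcluster(linkage_matrix, t=t, criterion="distance").max())
    for t in (1, 10, 25, 50, 95)
}
```

A low threshold yields many small clusters and therefore many small waves. A high threshold yields a few large clusters whose locations can be far apart.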

(11) Results for 5,000 lines grouped in Waves of 9 orders with Distance Threshold in [1, 95] (m) — (Image by Author)

We can find a local minimum for Distance_Threshold = 60 m, where the distance is reduced by 39% vs. Distance_Threshold = 1 m.

(11) Results for 20,000 lines grouped in Waves of 9 orders with Distance Threshold in [1, 95] (m) — (Image by Author)

We can find a local minimum for Distance_Threshold = 50 m, where the distance is reduced by 27% vs. Distance_Threshold = 1 m.

III. Conclusion

Advanced Diagnostic using Process Mining

Process mining is a type of data analytics that focuses on discovering, monitoring, and improving business processes.

This involves analyzing data from various sources, such as process logs, to understand how a process is being executed, identify bottlenecks and inefficiencies, and suggest ways to improve it.

Example of Process Mining for Distribution Process — (Image by Author)

Your Warehouse Management System (WMS) will record every step of the picking process:

Start of the wave by the operator
The first item picked, the second item picked …
The last item picked

A solution using process mining can support the automation of the diagnostic of productivity issues by targeting bottlenecks.
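As a first, very simple diagnostic on such a log, the time elapsed between consecutive events of a wave can point to the slowest steps. The event log below is entirely hypothetical (column names, events and timestamps). A full process mining tool would go well beyond this.

```python
import pandas as pd

# Hypothetical WMS event log: one row per recorded step of a wave
log = pd.DataFrame({
    "WaveID": [1, 1, 1, 1, 2, 2, 2],
    "Event": ["start", "pick", "pick", "pick", "start", "pick", "pick"],
    "TimeStamp": pd.to_datetime([
        "2020-08-11 09:00:00", "2020-08-11 09:00:40",
        "2020-08-11 09:03:40", "2020-08-11 09:04:10",
        "2020-08-11 09:10:00", "2020-08-11 09:10:30",
        "2020-08-11 09:11:15",
    ]),
})

log = log.sort_values(["WaveID", "TimeStamp"])
# Seconds elapsed since the previous event of the same wave
log["SecondsSincePrevious"] = (
    log.groupby("WaveID")["TimeStamp"].diff().dt.total_seconds()
)
# The two longest steps are candidate bottlenecks
slowest_steps = log.nlargest(2, "SecondsSincePrevious")
```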

For more information,

What is Process Mining?

Learn how to use Python for Process Mining and unlock the power of your business data with this comprehensive guide.

towardsdatascience.com

Optimize the Warehouse Layout with Pareto Analysis

When the Italian economist Vilfredo Pareto developed a mathematical formula to describe the distribution of wealth in Italy, he discovered that 80% of the wealth belonged to 20% of the population.

A few decades later, this rule was generalized to many other applications, including Supply Chain and Logistics Management.

Heatmap of the volumes picked per location — (Image by Author)

This principle can be used to optimize the layout by grouping high-rotation items, maximising picking and replenishment productivity by:

Reducing the walking distance for most of the orders
Limiting the congestion in the aisles
Reducing the space by optimizing picking location types
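A Pareto analysis of picked volumes can be sketched with a cumulative sum. The volumes per location below are invented for illustration, and the 80% cut-off is simply the usual rule of thumb.

```python
import pandas as pd

# Hypothetical units picked per location
volumes = pd.Series(
    {"A01": 500, "A02": 300, "A03": 100, "A04": 50, "A05": 30, "A06": 20},
    name="units_picked",
)

pareto = volumes.sort_values(ascending=False).to_frame()
# Cumulative share (%) of the total volume, from the busiest location down
pareto["cum_share_pct"] = (
    pareto["units_picked"].cumsum() * 100 / pareto["units_picked"].sum()
)
# Keep locations until the cumulative share reaches 80%
reached_before = pareto["cum_share_pct"].shift(fill_value=0)
high_rotation = pareto[reached_before < 80].index.tolist()
```

In this made-up example, two locations out of six carry 80% of the volume and would be candidates for the most accessible positions.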

For more details,

Reduce Warehouse Space with the Pareto Principle using Python

How the 80/20 rule implemented using python can optimize your layout, reduce space utilization and improve the picking…

towardsdatascience.com

Next step

Based on this feedback, the next steps will be:

Picking Route Creation: How can we find the best route minimizing walking distance for a list of Picking Locations to cover?

Improve Warehouse Productivity using Pathfinding Algorithm with Python

Implement Pathfinding Algorithm based on Travelling Salesman Problem Designed with Google AI Linear Optimization…

towardsdatascience.com

About Me

Let’s connect on LinkedIn and Twitter; I am a Supply Chain Engineer using data analytics to improve logistics operations and reduce costs.

If you are interested in Data Analytics and Supply Chain, have a look at my website.

Samir Saci | Data Science & Productivity

A technical blog focusing on Data Science, Personal Productivity, Automation, Operations Research and Sustainable…

samirsaci.com

References

[1] Samir Saci, Improve Warehouse Productivity using Order Batching with Python, Link
