
3 Best Approaches to Segment Your Customers

Learn the usage of hierarchical clustering, k-means clustering, and RFM segmentation

Customer segmentation

Today, the customer is at the center of everything. But you can’t satisfy everyone. That’s the reality. The sooner you accept this, the better it will serve you and your business. That’s why one of the first things business analysts do is segment customers (both current and potential) into groups based on their needs, wants, and shared characteristics. Knowing your customers’ preferences allows you to design tailored strategies to win them over and offer the best products and services. This is even more critical for Software as a Service (SaaS) businesses, where the customer retention ratio (RR) is one of the key KPIs.

There are several ways to segment customers. Among the popular ones are hierarchical clustering, k-means clustering, and Recency, Frequency, and Monetary (RFM) segmentation.

I recently wrote a three-part series that describes in more detail how to perform customer segmentation. Check it out:

Introduction to customer segmentation

Customer segmentation with k-means clustering

Improve the k-means customer segmentation model with PCA

Hierarchical segmentation

According to Wikipedia, hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters.

Hierarchical clustering – Wikipedia

In hierarchical clustering, the most similar pairs of samples are grouped together first, and the resulting groups are then merged step by step to form the next levels of the hierarchy. The end result is a dendrogram (a tree structure). Analysts then decide how many clusters to form by inspecting this dendrogram.

The following code produces a dendrogram based on a standardized customer dataset. The entire code can be found in the Deepnote Notebook.
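The notebook code is not reproduced in this text, so here is a minimal sketch of the idea rather than the notebook’s actual code. It assumes SciPy, a small synthetic dataset with two hypothetical features in place of the real customer data, and Ward linkage, which may differ from what the notebook uses.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Synthetic stand-in for a customer table (hypothetical features:
# annual spend and number of orders); not the article's dataset.
rng = np.random.default_rng(42)
customers = np.vstack([
    rng.normal(loc=[200.0, 5.0], scale=[20.0, 1.0], size=(20, 2)),
    rng.normal(loc=[1000.0, 30.0], scale=[50.0, 3.0], size=(20, 2)),
])

# Standardize each feature to zero mean and unit variance.
standardized = (customers - customers.mean(axis=0)) / customers.std(axis=0)

# Build the merge hierarchy bottom-up (Ward linkage is an assumed choice).
merges = linkage(standardized, method="ward")

# no_plot=True returns the tree layout without drawing it; drop the
# argument (with matplotlib installed) to draw the dendrogram.
tree = dendrogram(merges, no_plot=True)

# Cut the tree into the number of clusters chosen from the dendrogram.
labels = fcluster(merges, t=2, criterion="maxclust")
```

Here `t=2` stands for the cluster count an analyst would read off the dendrogram; with real data that number comes from inspecting the tree, not from the code.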

K-means algorithm

K-means clustering is an unsupervised clustering algorithm. It assigns each observation to the cluster whose mean (centroid) is nearest. Here are the steps required to implement the K-means algorithm:

  1. Choose the number of clusters (k).
  2. Assign an initial centroid to each cluster (pass 'k-means++' to the init parameter of scikit-learn’s KMeans class).
  3. Assign each observation to the cluster whose centroid is nearest according to the chosen distance measure.
  4. Compute the revised centroids.
  5. Repeat steps 3 and 4 as long as the centroids keep changing.
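The steps above can be sketched in plain NumPy. This is an illustration under stated assumptions, not scikit-learn’s implementation: the function name `kmeans` and its parameters are hypothetical, the initial centroids are picked at random rather than with k-means++, and the distance measure is Euclidean.

```python
import numpy as np


def nearest_centroid(points, centroids):
    """Return, for every point, the index of its nearest centroid."""
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


def kmeans(points, k, max_iter=100, seed=0):
    """Toy k-means; k is chosen by the caller (step 1)."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct observations as the initial centroids
    # (plain random initialization, not k-means++).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 3: assign each observation to its nearest centroid.
        labels = nearest_centroid(points, centroids)
        # Step 4: recompute each centroid as the mean of its members,
        # keeping the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 5: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the centroids that are returned.
    return nearest_centroid(points, centroids), centroids


# Two obvious groups of made-up observations.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centroids = kmeans(points, k=2)
```

In practice you would rely on scikit-learn’s KMeans, which adds k-means++ initialization and multiple restarts on top of this loop.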

k-means clustering – Wikipedia

Once we know the optimal number of clusters, we can segment our customers by passing that number to the n_clusters parameter.

Once the model is trained with the data, we can get the assigned cluster labels from the labels_ attribute of the kmeans object.
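As a minimal sketch of these two steps with scikit-learn, assuming three clusters and a synthetic, already scaled dataset in place of the real customer data (the variable names are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for scaled customer features: three separated groups.
rng = np.random.default_rng(0)
customers = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.3, size=(20, 2)),
])

# n_clusters is the number of segments chosen beforehand (3 is assumed here).
kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=42)
kmeans.fit(customers)

# labels_ holds one cluster index per row, in the order of the input rows.
segments = kmeans.labels_
```

The cluster indices themselves are arbitrary; what matters is which customers share the same index.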

RFM segmentation

Recency, Frequency, and Monetary (RFM) segmentation is a managerial customer segmentation process that is very adaptable and easy to understand. The key entities are:

  • Recency: How recently the customer last interacted with the business.
  • Frequency: How often the customer makes a purchase.
  • Monetary: The total amount of money a customer spent.

The key element of RFM segmentation is to assign static, management-defined weights to the three factors and calculate a final grade for each customer, which determines the customer’s group.
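A minimal sketch of that weighted grading follows. Everything specific in it is hypothetical: the weights, the 1 to 5 score scale for each factor, the grade thresholds, the group names, and the sample customers are made up for illustration.

```python
# Hypothetical management weights for the three factors; they sum to 1.
WEIGHTS = {"recency": 0.2, "frequency": 0.3, "monetary": 0.5}


def rfm_grade(scores):
    """Weighted sum of a customer's recency, frequency, and monetary scores."""
    return sum(WEIGHTS[factor] * scores[factor] for factor in WEIGHTS)


def rfm_group(grade):
    """Map a grade to a group using made-up thresholds."""
    if grade >= 4.0:
        return "gold"
    if grade >= 2.5:
        return "silver"
    return "bronze"


# Each factor is assumed to be pre-scored from 1 (worst) to 5 (best).
customers = {
    "alice": {"recency": 5, "frequency": 4, "monetary": 5},
    "bob": {"recency": 3, "frequency": 2, "monetary": 3},
    "carol": {"recency": 1, "frequency": 1, "monetary": 2},
}

grades = {name: rfm_grade(scores) for name, scores in customers.items()}
groups = {name: rfm_group(grade) for name, grade in grades.items()}
```

Because the weights are static, changing the business priority (for example, valuing recency over spend) only requires editing `WEIGHTS`.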

Customer segmentation is the first step. The next step is to set up strong strategies. While implementing your strategy, stay focused and check periodically to ensure you are on the right track.

All the best.



Join Medium with my referral link – Asish Biswas

