User Segmentation Based on Their Purchase History

A use case on how we target users with different products based on their purchasing power using machine learning.

muffaddal qutbuddin
Towards Data Science


Source: Lisa, via Pixabay

Introduction

The goal of this analysis was to identify different user groups based on the deals they had availed through a discount app, so that they could be re-targeted with offers similar to the ones they had availed in the past.

The K-means machine learning algorithm was used to identify user segments based on their purchase behavior. Here is a 3-D illustration of what the algorithm extracted.

Four user segments created by k-means algorithm using purchase history of users
3D image of clusters produced by K-Means, by Muffaddal

Want to enhance your targeting with personalized user segmentation? Let’s talk

Terminology

Before going deeper into the analysis, let’s define the key terms used.

Deal Avail: When a user avails a discount using the app.
Spent: The discounted price a user pays when buying an item.
Saved: The amount a user saved through the app.
Brands: Vendors for which discounts are offered, such as Pizza Hut and GreenO.
Deals: Discounts offered to users on different outlets and brands.

Analysis

Data sets

The behavior data set was extracted from Mixpanel using JQL. The following fields were used for this analysis:

Mixpanel Data Set, by Muffaddal

userId: unique id of user
saveAmount: amount saved by user on deal avail
spentAmount: amount spent by user on deal avail
brandName: brand for which deal was availed
count: number of deals availed by user

Using the above data set, averageSpentAmount, averageSavedAmount, and dealAvailCount were calculated for each user, as shown below.

Average Deal Availed Data set, by Muffaddal
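The per-user aggregation described above can be sketched roughly as follows. This is a minimal Python illustration, not the code used in the analysis. It assumes that the amounts in each row are totals for the `count` deals in that row and that `count` is at least 1; the sample rows and the helper name `summarize_users` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows. Field names follow the data set described above;
# the values are invented for illustration.
rows = [
    {"userId": "u1", "saveAmount": 200, "spentAmount": 800, "brandName": "Pizza Hut", "count": 1},
    {"userId": "u1", "saveAmount": 100, "spentAmount": 400, "brandName": "GreenO", "count": 1},
    {"userId": "u2", "saveAmount": 50, "spentAmount": 150, "brandName": "GreenO", "count": 1},
]


def summarize_users(rows):
    """Aggregate raw rows into one record of averages and counts per user."""
    totals = defaultdict(lambda: {"spent": 0.0, "saved": 0.0, "deals": 0})
    for row in rows:
        user_totals = totals[row["userId"]]
        user_totals["spent"] += row["spentAmount"]
        user_totals["saved"] += row["saveAmount"]
        user_totals["deals"] += row["count"]
    # Assumption: every user has at least one deal, so the division is safe.
    return {
        user: {
            "averageSpentAmount": t["spent"] / t["deals"],
            "averageSavedAmount": t["saved"] / t["deals"],
            "dealAvailCount": t["deals"],
        }
        for user, t in totals.items()
    }


user_stats = summarize_users(rows)
for user, stats in user_stats.items():
    print(user, stats)
```

Each resulting record holds the three features that are later fed to the clustering step.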

Machine Learning — K-means Clustering
The first step in applying K-means was to find an optimal number of clusters for segmentation. There are a number of methods for this purpose, one of which is the elbow method, which uses the within-cluster sum of squares (WCSS).

WCSS for up to 10 clusters, by Muffaddal

Based on the elbow method, 4, 5, and 6 clusters were tried to explore the segments, and 4 clusters were picked as the best fit for the given data set.

R code for K-Means clustering
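As a rough illustration of the elbow method, here is a small Python sketch (the analysis itself was done in R, and this is not that code). It runs a plain Lloyd's-style K-means for several values of k and records the WCSS for each. The sample points, the fixed seed, and the function names are assumptions made for the example, and the features are used unscaled because the article does not say whether they were scaled.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares for a given assignment."""
    return sum(
        squared_distance(p, centroids[label]) for p, label in zip(points, labels)
    )


# Hypothetical users as (averageSpentAmount, averageSavedAmount, dealAvailCount).
points = [
    (900, 850, 1), (950, 900, 2),
    (800, 200, 1), (850, 250, 1),
    (150, 50, 1), (200, 60, 2),
    (180, 70, 9), (220, 80, 10),
]

elbow_curve = {}
for k in range(1, 7):
    labels, centroids = kmeans(points, k)
    elbow_curve[k] = wcss(points, labels, centroids)

for k, value in elbow_curve.items():
    print(k, round(value, 1))
```

Plotting `elbow_curve` against k and looking for the point where the curve stops dropping sharply gives the candidate cluster counts to explore.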

I would recommend these courses on DataCamp and Coursera if you want to learn more about user clustering and user segmentation.

What Segments Did K-means Extract?

The following were the average stats of the four identified segments:

Average stats of each segment
Segments Characteristics
Graphical Representation of Segments Characteristics, by Muffaddal

Users in segments 1 and 2 were high-paying users, and segment 1 users had also saved an equally high amount per deal (they probably availed buy 1 get 1 offers). However, these users availed fewer than 2 deals on average (1.3 and 1.4 respectively).

On the other hand, segment 3 and segment 4 users spent less and hence saved less as well. However, segment 4 users had the highest number of deals availed per user of all 4 segments (on average, more than 9 deals per user). It was the most converted cohort of users.
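Per-segment averages like the ones discussed above could be computed along these lines once every user carries a segment label. This is a hedged Python sketch; the sample records and the helper name `segment_averages` are hypothetical and the numbers are not taken from the article's data.

```python
from collections import defaultdict

# Hypothetical per-user records that already carry a segment label.
users = [
    {"segment": 1, "averageSpentAmount": 900, "averageSavedAmount": 850, "dealAvailCount": 1},
    {"segment": 1, "averageSpentAmount": 1000, "averageSavedAmount": 950, "dealAvailCount": 2},
    {"segment": 4, "averageSpentAmount": 200, "averageSavedAmount": 80, "dealAvailCount": 9},
    {"segment": 4, "averageSpentAmount": 300, "averageSavedAmount": 100, "dealAvailCount": 11},
]

FEATURES = ("averageSpentAmount", "averageSavedAmount", "dealAvailCount")


def segment_averages(users):
    """Return the mean of each feature for every segment."""
    grouped = defaultdict(list)
    for user in users:
        grouped[user["segment"]].append(user)
    return {
        segment: {
            feature: sum(u[feature] for u in members) / len(members)
            for feature in FEATURES
        }
        for segment, members in grouped.items()
    }


averages = segment_averages(users)
for segment, stats in sorted(averages.items()):
    print(segment, stats)
```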

What was the total number of users and the number of deals availed in each segment?

Here are the total number of users in each segment and the number of deals they had availed.

Number of users in segments, by Muffaddal
Number of deals availed, by Muffaddal

57% of users belonged to segment 3, and only 3% of users were from the most converted segment (i.e. segment 4).
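Shares such as the 57% and 3% quoted above are simple counts per segment divided by the total number of users. A minimal Python sketch, using invented data and a hypothetical helper `segment_shares`:

```python
from collections import Counter

# Hypothetical (segment, dealAvailCount) pairs, one per user.
user_segments = [(3, 1), (3, 2), (3, 1), (1, 1), (2, 2), (4, 9), (3, 1), (2, 1)]


def segment_shares(user_segments):
    """Count users and deals per segment and express users as a percentage."""
    users_per_segment = Counter()
    deals_per_segment = Counter()
    for segment, deals in user_segments:
        users_per_segment[segment] += 1
        deals_per_segment[segment] += deals
    total_users = sum(users_per_segment.values())
    return {
        segment: {
            "users": users_per_segment[segment],
            "userSharePct": 100 * users_per_segment[segment] / total_users,
            "deals": deals_per_segment[segment],
        }
        for segment in sorted(users_per_segment)
    }


shares = segment_shares(user_segments)
for segment, stats in shares.items():
    print(segment, stats)
```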

What was users’ overall spending?

Here is the spread of spending in each segment.

Spending of users in each segment, by Muffaddal

Some of the users from segment 4 had high spending (yellow dots in segment 4), similar to segments 1 and 2, but segment 3 (which comprises 57% of the users) didn't go for high-spending deals and/or brands at all.

What type of brand did each segment’s users prefer?

Let’s look at the types of brands the users in each segment availed, to understand any distinctions between them.

Brands users availed, by Muffaddal

Segment 1 users had availed a mix of burger, pizza, and fun time brands; segment 2 users had availed pizza; and segment 3 users had preferred burgers. Segment 4 users (the most converted users), meanwhile, preferred juices and other types of brands.

What brands did each segment avail?

Here are the top 10 brands that users in each segment had availed.

Top 10 Brands Availed by Each Segment, by Muffaddal

Looking at the brands, we can understand what types of brands and deals the users in each segment would prefer. Segment 1 and 2 users (high-paying users) had availed premium brands such as Sajjad, kababi, Charcoal, California, etc., while segment 3 and 4 users (low-paying users) had mostly opted for medium- to low-tier brands.
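A top-N brand list per segment can be derived by counting availed deals per brand within each segment. The Python sketch below uses placeholder brand names and invented counts, and `top_brands` is a hypothetical helper rather than anything from the original analysis.

```python
from collections import Counter, defaultdict

# Hypothetical (segment, brandName) pairs, one per availed deal.
# Brand names are placeholders, not brands from the analysis.
avails = [
    (1, "Brand A"), (1, "Brand A"), (1, "Brand B"),
    (3, "Brand C"), (3, "Brand C"), (3, "Brand C"),
    (3, "Brand D"), (3, "Brand D"), (3, "Brand B"),
]


def top_brands(avails, n=10):
    """Return the n most frequently availed brands for each segment."""
    counts = defaultdict(Counter)
    for segment, brand in avails:
        counts[segment][brand] += 1
    return {
        segment: [brand for brand, _ in brand_counts.most_common(n)]
        for segment, brand_counts in counts.items()
    }


top = top_brands(avails, n=2)
for segment, brands in sorted(top.items()):
    print(segment, brands)
```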

How can these results be employed?

Based on different user segments we can:

1- Targeted Ads
Personalizing ads for each segment would increase the conversion rate, as users are more likely to convert on specific brands and offers. For example, show Sajjad’s ads to users with higher paying power rather than to users with low paying power.

2- In-app Recommendations
Optimize the app to recommend the deals and discounts that users in each segment would be more interested in.
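One way both ideas could be wired up is to assign a user to the nearest segment centroid and then look up the brands that segment tends to prefer. The Python sketch below is only an illustration under stated assumptions: the centroid values, the brand lists, and the function names `assign_segment` and `recommend_brands` are all hypothetical, and the article does not describe how targeting was actually implemented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical centroids keyed by segment, as
# (averageSpentAmount, averageSavedAmount, dealAvailCount).
centroids = {
    1: (900.0, 850.0, 1.3),
    2: (850.0, 250.0, 1.4),
    3: (170.0, 50.0, 1.5),
    4: (200.0, 70.0, 9.0),
}

# Hypothetical preferred brands per segment (placeholder names).
segment_brands = {
    1: ["Brand A", "Brand B"],
    2: ["Brand B", "Brand C"],
    3: ["Brand D"],
    4: ["Brand E", "Brand D"],
}


def assign_segment(user_vector, centroids):
    """Return the segment whose centroid is closest to the user."""
    return min(centroids, key=lambda s: squared_distance(user_vector, centroids[s]))


def recommend_brands(user_vector):
    """Pick the brand list of the user's nearest segment."""
    return segment_brands[assign_segment(user_vector, centroids)]


print(recommend_brands((880, 800, 1)))   # a high-spending, high-saving user
print(recommend_brands((190, 65, 10)))   # a low-spending, frequent user
```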

Summary

To sum up, with data and proper effort we were able to identify interesting information about users and their preferences, and to strategize how to engage them more based on those preferences.

Need help with User Segmentation? Let’s connect.
