Monitoring repeat purchase behavior is essential in ensuring a profitable retail or e-commerce business. Repeat customers generally spend more per purchase and are more likely to recommend the store or product to other customers, boosting new customer acquisition and revenue.
Automated reporting dashboards help business owners and analysts get real-time visibility on repeat purchase levels. In this post, I shall demonstrate how this can be done using Google Data Studio and BigQuery.
If you work with an e-commerce business, chances are you already have a Google Analytics account and are able to view retention rates there. The advantage of building this dashboard is that you can customize the views to your needs and add metrics that are uniquely helpful for your business. Aside from that, you can integrate the dashboard with other data sources easily, say, if you are able to get data from surveys or from social media. All the insights are in one place, so decisions can be made more swiftly.
This is a sample repeat purchase and retention dashboard done in Google Data Studio:
This article features Google Data Studio and Big Query, but I will focus more on creating the analytics and metrics rather than teaching the basics about the tools. For beginners, I will put some resources towards the end of the article for your reference.
The dataset
The data used for this demo is the Online Retail Dataset from the UCI Machine Learning Repository. It is a dataset containing transactions from an online retail store. You can use any transactional data containing customer IDs, order IDs, order dates, and sales to be able to replicate this dashboard.
The data was downloaded from the website and uploaded manually to Google Bigquery, a scalable and cost-effective data warehouse. If your data comes from an online store, integrations can be set up so that the data flows from the shop to a third-party connector, and then to your BigQuery warehouse. If you have this setup, the data will refresh regularly, ensuring that you get the most updated insights about your business.
I will not go into detail on how to set up the data warehouse as I intend to focus more on the dashboard analytics, but BigQuery has quite extensive documentation here, for reference.
Connecting Data Studio to the BigQuery Dataset
After setting up the data warehouse, the next step is to create a Data Source inside Google Data Studio. Here are the steps to do this:
- On the Data Studio home page, click Create. Choose Data Source.
- A variety of Google Connectors will appear. Choose BigQuery.
- Choose the BigQuery project and dataset for the transactional data.
- Click Connect.
- All the fields inside the data will appear. Check that the data formats are correct, and add custom calculated fields when needed. For this dashboard, I added the following:
% Repeat Customers:
COUNT_DISTINCT(CASE WHEN RepeatPurchase = 'Repeat Customer' THEN CustomerID ELSE NULL END) / COUNT_DISTINCT(CustomerID)
Average Order Value (AOV): SUM(Sales)/COUNT_DISTINCT(InvoiceNo)
Purchase Frequency: COUNT_DISTINCT(InvoiceNo) / COUNT_DISTINCT(CustomerID)
- Click Create a Report.
Now you will see a blank dashboard where you can put all the chart elements needed.
Creating the Dashboard
Let’s break the dashboard components down into key elements:
- Page Filters
- Topline Performance on Key Metrics
- Monthly Trends in Repeat Purchase
- Repeat Purchase Rates (aka Cohort Analysis)
- Products Bought by Repeat Purchasers
Page Filters
Page-level filters are usually the ones you see at the upper portion of dashboards. It provides the interactivity feature as it enables the user to drill down into the performance of specific groups. In our dashboard, we are able to select which country to focus on (I chose the UK as default), product types, repeat purchase groups (first-time buyers versus repeat buyers), and transaction date.
Topline Performance on Key Metrics
The first element of the dashboard should contain the most important business metrics to see. I prefer to have these as simple scorecards – no fancy charts, straight up the figures you need to see about the business health. I also prefer to use compact formatting – rounding off with at most 1-digit decimal places and putting thousands (K) or millions (M) markers instead of showing the exact values. This makes the charts less cluttered while not losing much of the needed information.
Positive and negative growth indicators (green and red) are a great thing to include. Without using any additional chart, you’ll see if the business performed over or under the previous period on a specific metric.
CODE: How to create Repeat Customer Flags using BigQuery
One of the key metrics highlighted in our scorecards is Repeat Customers (Customers who have already bought from the store before). This metric cannot be calculated in Data Studio, but can be easily done in BigQuery with this code:
SELECT *,
CASE WHEN customer_seq > 1 THEN 'Repeat Customer'
ELSE 'New Customer'
END AS RepeatPurchase
FROM (
SELECT *,
RANK() OVER (PARTITION BY CustomerID ORDER BY date) AS customer_seq
FROM online_retail
)
The new field RepeatPurchase is a string field that indicates if the customer is a ‘Repeat Customer’ or a ‘New Customer’.
Monthly Trends in Repeat Purchase
Needless to say, it is necessary to see the important metrics on a monthly view. By doing this, one is able to check the trends and seasonal patterns and act on them accordingly.
In this dashboard, I first displayed the raw counts of repeat purchase customers along with purchase frequency. The second chart shows the percentage of repeat buyers versus first-time buyers. Looking at both raw counts and percentages, it appears that the store is getting more repeat buyers across time, which is a good sign.
Repeat Purchase Rates (aka Cohort Analysis)
This analysis shows the detailed repurchase rates by focusing on customer "cohorts". In this case, we define a cohort as a group of customers buying for the first time on a specific month. In the example, the first cohort is the Jan 2011 cohort. The columns represent the reference month, i.e. after n number of months, what is the repurchase rate?
From this table, 24% of Jan 2011 first-time purchasers are repurchasing from the store a month after, 24.2% repurchase rate 3 months after, and 36.8% repurchase rate 10 months after.
The "Grand Total" row represents the average repurchase rate regardless of which cohort we’re looking at. Overall, the shop has a 1-month repurchase rate of 23.7%, 3-month repurchase rate of 23.5%, and 10-month repurchase rate of 36.8%.
CODE: How to Execute Cohort Analysis Using BigQuery
This code snippet shows how to calculate the repurchase rates in the cohort table:
The dashboard also contains a cohort analysis for the percentage of sales – for the sake of brevity, the code is not shown here as it’s essentially the same as the above (instead of counting the customers, compute for the sum of sales).
Products Bought by Repeat Purchasers
The last portion of the dashboard focuses more on the details – which products contribute most in terms of repeat purchase customers and sales? Just a simple table for this will go a long way. While not covered here, an analysis to determine profitable product combinations, such as Market Basket Analysis, can provide marketing and promotion insights for your business.
Summary
In this article, I introduced the value of a Repeat Purchase Dashboard to monitor a retail shop’s performance:
- Seeing the topline performance at one glance
- Visualizing the monthly trends in repeat purchase
- Computing the repeat purchase rates through cohort analysis
This example focused on online retail sales data but can be used for other sales data sources with a similar data structure. Any transactional data containing customer IDs, order IDs, order dates, and sales can be used to replicate this dashboard.
I also highlighted the capability of Google Data Studio for making these kinds of reports, and how BigQuery reinforces it to make even more advanced analytics through the Sql language.
All the code in this article can be accessed on my GitHub. Happy analyzing!
Useful Resources for Google Data Studio and Big Query: