Unlocking Behavioural Secrets to Overcome Churn Extremes

Understand, Predict, and Minimise Customer Churn

Published in

Towards Data Science

7 min read · Dec 31, 2019

Introduction

Customer retention has become a central issue in today’s turbulent business environment and presents a significant challenge for many service companies. Cost-cutting and intense competitive pressure are of particular concern, and companies that do not fully exploit their existing customer base risk being left at a disadvantage.

Academics have generated a large body of research that addresses part of that challenge — with a particular focus on predicting customer churn. However, there is still a need to understand the various perceptions of churn that exist among businesses. The central purpose of managing customer relationships is for the enterprise to focus on increasing the overall value of its customer base — and customer retention is critical to its success.

This blog series describes the design and implementation of approaches that can be used to control customer churn. It focuses on proactive churn management, where the customer is contacted before they are predicted to churn and offered a service or incentive designed to prevent them from leaving. By employing qualitative and quantitative modes of enquiry, I attempt to illuminate the predictive modelling of customer churn and present a framework for developing a proactive churn management program.

The Problem

Using a simple retention model, the lifetime value of a customer is:
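A common textbook form of the simple retention model is sketched below. It is not necessarily the exact expression intended here. It assumes a constant margin m per period and a constant discount rate δ (both symbols are assumptions introduced for illustration), with r the retention rate per period:

$$
\mathrm{LTV} = \sum_{t=1}^{\infty} \frac{m \, r^{t-1}}{(1+\delta)^{t-1}} = \frac{m \, (1+\delta)}{1+\delta-r}
$$

The closed form follows from summing the geometric series with ratio r / (1 + δ), which is below one whenever r ≤ 1 and δ > 0.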

Central to the entire discipline of customer churn management is the concept of retention, r. At the customer level, churn refers to the probability that the customer leaves the firm in a given time period. At the firm level, churn is the percentage of the firm’s customer base that leaves in a given time period. Churn is, therefore, one minus the retention rate: $c = 1 - r$, where $c$ denotes the churn rate.
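To make the firm-level definition concrete, here is a minimal Python sketch. The function names and the customer counts are made up for illustration. The expected-lifetime helper additionally assumes that churn is constant from period to period, in which case the expected lifetime is simply one over the churn rate.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Firm-level churn: share of the starting customer base lost in the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def expected_lifetime(churn: float) -> float:
    """Expected number of periods a customer stays, assuming constant churn."""
    if not 0 < churn <= 1:
        raise ValueError("churn must be in (0, 1]")
    return 1.0 / churn


# Made-up figures: 1,000 customers at the start of the month, 50 of whom leave.
c = churn_rate(1000, 50)
r = 1.0 - c  # retention is one minus churn
print(f"churn={c:.2%}, retention={r:.2%}, "
      f"expected lifetime={expected_lifetime(c):.0f} months")
```

Even a seemingly small monthly churn rate implies a short expected customer lifetime, which is why retention matters so much for lifetime value.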

Churn is a concern for any industry where a simple retention lifetime value model is applicable, i.e., where customers can leave and do not naturally return without a significant re-acquisition effort. Examples include magazine and newsletter publishing, investment services, insurance, electric utilities, health care providers, and credit card providers.

There are two basic types of churn: subscription churn and non-subscription churn.

Subscription churn happens in businesses where users or customers are on contract for a set period of time (monthly, annually, etc.; think cable, network, or phone providers) and choose not to come back after that contract is up. Because of the fixed contract period, this type of churn is comparatively easy to define, predict, and prevent: there is a clear, defined window of churn risk on which marketing activities can be focused.

Non-subscription churn occurs in businesses where users or customers can end their relationship at any time; they come and go at will. A customer may gradually reduce their purchase frequency over time, or may suddenly never buy again. This blog will focus on the process of preventing subscription churn.

Data Acquisition

Traditionally, churn prediction has relied on little more than some form of customer identifier and the date/time of that customer’s last interaction. This data, though not very detailed, allows you to build models that predict churn at a basic level. In practice, however, adding further data on top of this minimum data set is highly recommended. More relevant data generally leads to better churn predictions, so if available, also include static demographic information about users, details on specific types of user actions, and so on.
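As a rough illustration of what the minimum data set allows, the sketch below flags customers as inactive based only on an identifier and a last-interaction date. The customer IDs, dates, and the 60-day threshold are all assumptions chosen for the example, not values from any real data set.

```python
from datetime import date

# Hypothetical minimum data set: customer id -> date of last interaction.
last_interaction = {
    "C001": date(2019, 12, 20),
    "C002": date(2019, 9, 1),
    "C003": date(2019, 11, 15),
}


def days_inactive(last_seen: date, as_of: date) -> int:
    """Number of days between the last interaction and the reference date."""
    return (as_of - last_seen).days


def flag_inactive(last_seen_by_customer, as_of, threshold_days=60):
    """Mark a customer as inactive when they exceed the inactivity threshold."""
    return {
        customer_id: days_inactive(last_seen, as_of) > threshold_days
        for customer_id, last_seen in last_seen_by_customer.items()
    }


flags = flag_inactive(last_interaction, as_of=date(2019, 12, 31))
for customer_id, inactive in flags.items():
    print(customer_id, "inactive" if inactive else "active")
```

A rule like this is only a baseline. Demographic and behavioural features are what let a model do better than a fixed recency cut-off.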

Selected Data

I am going to use the “Telco Customer Churn” data [IBM Sample Data Sets] to predict behaviour in order to retain customers. In this dataset, each row represents a customer and each column contains one of the customer’s attributes, as described in the column metadata.

The data set includes information about:

Customers who left within the last month — the column is called Churn
Services that each customer has signed up for — phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
Customer account information — how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
Demographic info about customers — gender, age range, and if they have partners and dependents

Data Preprocessing

Data preprocessing is a data mining technique used to transform raw data into a useful and efficient format. It is often said to account for up to 80% of the total time spent on a project, as data are gathered from multiple sources at various points in time. Understanding the different variables in your data is fundamental to building intuition. It is therefore advisable to allocate enough time to this part before moving on to cleaning up inconsistent spellings or missing data so that everything is homogeneous. Thorough exploration and cleaning will save time in subsequent steps, particularly when it comes to prediction.
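The kind of clean-up described above can be sketched with the standard library alone. The three rows below are invented, and the column names are assumed to follow the public Telco file, so they should be checked against your own copy. The sketch trims stray whitespace, normalises spelling, converts blank charges to missing values, and encodes the churn label as 0/1.

```python
import csv
import io

# Invented sample rows; column names are assumed from the public Telco file.
SAMPLE_CSV = """customerID,tenure,Contract,MonthlyCharges,TotalCharges,Churn
A-001,1,Month-to-month,29.85,29.85,No
A-002,0,Two year,52.55, ,No
A-003,2, month-to-month ,53.85,108.15,Yes
"""


def to_float_or_none(text):
    """Convert a numeric string to float, treating blanks as missing."""
    text = text.strip()
    return float(text) if text else None


def clean_row(row):
    """Normalise one raw CSV row into typed, consistently spelled values."""
    return {
        "customerID": row["customerID"].strip(),
        "tenure": int(row["tenure"]),
        "Contract": row["Contract"].strip().lower(),
        "MonthlyCharges": to_float_or_none(row["MonthlyCharges"]),
        "TotalCharges": to_float_or_none(row["TotalCharges"]),
        "Churn": 1 if row["Churn"].strip().lower() == "yes" else 0,
    }


rows = [clean_row(raw) for raw in csv.DictReader(io.StringIO(SAMPLE_CSV))]
missing_total = [row["customerID"] for row in rows if row["TotalCharges"] is None]
print("Rows with missing TotalCharges:", missing_total)
```

Whether to drop, impute, or keep rows with missing values is a modelling decision; the sketch only makes the gaps explicit.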

The churn distribution shows that we are dealing with an imbalanced problem, as there are many more non-churned than churned users. Roughly a quarter of our sample are no longer customers (Figure 1). This has implications for building a classification model, which I will discuss in more detail during the churn modelling process.
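To see why the imbalance matters, consider the sketch below. The labels are made up to mirror the roughly one-quarter churn share, not taken from the data set. It shows the accuracy a model would get by always predicting "no churn", and one common (assumed, not necessarily the approach used later in this series) response: inverse-frequency class weights.

```python
from collections import Counter

# Made-up labels with roughly a quarter of customers churned.
labels = ["No"] * 75 + ["Yes"] * 25


def class_shares(labels):
    """Share of each class in the sample."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def balanced_class_weights(labels):
    """Inverse-frequency weights: total / (number of classes * class count)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: total / (len(counts) * count) for label, count in counts.items()
    }


shares = class_shares(labels)
majority_baseline = max(shares.values())  # accuracy of always predicting "No"
weights = balanced_class_weights(labels)

print("Class shares:", shares)
print("Majority-class baseline accuracy:", majority_baseline)
print("Balanced class weights:", weights)
```

Any classifier has to beat the majority baseline to be useful, which is why accuracy alone is a poor yardstick for imbalanced churn data.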

Data Visualisation and Feature Selection

Feature extraction aims at reducing the number of variables (attributes) by leaving the ones that represent the most discriminative information. Feature extraction helps to reduce the data dimensionality (dimensions are columns with attributes in a dataset) and exclude irrelevant information.

During feature selection, one can revise previously extracted features and define a subgroup of them that’s most correlated with customer churn. As a result of feature selection, we can have the dataset with only relevant features.
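One simple way to sketch this kind of filter is to rank features by the absolute value of their Pearson correlation with the churn label and keep the top few. The feature names and values below are toy data invented for the example, and the ranking rule is an assumption rather than the exact procedure behind the figures in this post.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(features, target, top_k):
    """Return the top_k (name, correlation) pairs by absolute correlation."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top_k]


# Toy data: 1 = churned, 0 = stayed.
churn = [1, 1, 1, 0, 0, 0, 0, 0]
features = {
    "month_to_month": [1, 1, 1, 0, 0, 1, 0, 0],
    "tenure_months": [2, 5, 1, 40, 60, 12, 30, 72],
    "paperless_billing": [1, 0, 1, 1, 0, 1, 0, 1],
}

top = rank_features(features, churn, top_k=2)
for name, score in top:
    print(f"{name}: {score:+.2f}")
```

Correlation only captures linear, one-variable-at-a-time relationships, so a filter like this is a first pass rather than a final feature set.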

As shown in Figure 2, the majority of customers stream movies and TV. Likewise, the majority of customers have paperless billing and do not have online security.

The figure below shows most of the customers have phone service with a single phone line. Fibre optic internet connection is more popular than DSL internet service, and each online service has a minority of users. Approximately half of the sample are on month-to-month contracts with the remaining split between one and two-year contracts (Figure 3).

The bottom half of the figure shows payment method and tenure. The tenure distribution is stacked at the tails, with a large proportion of customers in the shortest (0–12 month) tenure band.
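Grouping tenure into bands like this can be sketched as follows. Only the 0–12 month band comes from the text above; the 12-month width for the remaining bands and the sample tenures are assumptions made for illustration.

```python
from collections import Counter


def tenure_band(tenure_months: int, width: int = 12) -> str:
    """Label the band a tenure falls into, e.g. '0-12', '13-24', '25-36'."""
    if tenure_months < 0:
        raise ValueError("tenure_months must be non-negative")
    # Months 0..width share the first band; later bands start one month after
    # the previous upper edge.
    index = max(0, (tenure_months - 1) // width)
    lower = 0 if index == 0 else index * width + 1
    upper = (index + 1) * width
    return f"{lower}-{upper}"


# Made-up tenures in months.
tenures = [0, 1, 3, 12, 13, 30, 70, 72]
band_counts = Counter(tenure_band(months) for months in tenures)
for band, count in band_counts.items():
    print(band, count)
```

Binning makes a skewed tenure distribution easier to read in a bar chart, at the cost of hiding variation within each band.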

The results of the correlation analysis are summarised in Figure 4. At a glance, you can see that churn is related, with varying strength, to a number of variables, e.g. contract type, total charges, and internet service. Sometimes a causal relationship seems obvious. However, correlation does not mean that changes in one variable actually cause changes in the other. In statistics, you typically need to perform a randomised, controlled experiment to determine that a relationship is causal rather than merely correlational.

Conclusions

We have discussed the churn problem, its causes, and some of the ways to find patterns and build intuition from the exploratory data analysis. Next time, we will take a look at how predictive models can identify churners and unlock the reasons they churn, and how companies can manage churn.


References:

Ascarza et al (2018): In pursuit of enhanced customer retention management: Review, key issues, and future directions. Springer Science+Business Media, LLC 2017

Blattberg et al (2008): “Database Marketing: Analyzing and Managing Customers”. Springer

Ghorbani and Taghiyareh (2009): CMF: A Framework to Improve the Management of Customer Churn. IEEE Asia-Pacific Services Computing Conference (IEEE APSCC)

Han et al. (2011): Data Mining: Concepts and Techniques. The Morgan Kaufmann Series in Data Management Systems. Elsevier Science

“gist-syntax-themes”: https://github.com/lonekorean/gist-syntax-

https://businessscience.github.io/correlationfunnel/articles/introducing_correlation_funnel.html

https://blogs.rstudio.com/tensorflow/posts/2018-01-11-keras-customer-churn/