SyriaTel Customer Churn Analysis

For my third project at the Flatiron School, I chose to analyze the dataset on customer churn for a telecommunications company, SyriaTel. The objective was to build a classifier to determine if a customer would ‘soon’ leave SyriaTel, and to determine if there were predictable patterns.

The data included no time information. Instead, a ‘Churn’ feature indicated whether each customer had churned, which made the project a binary classification task.

I posed three questions to answer with my classifier.

  1. What features of the dataset are primary determinants of Customer Churn and to what extent?
  2. What are the ways that these findings can be interpreted and how can SyriaTel implement cost-effective solutions?
  3. Will these solutions be feasible in reducing the customer churn rate by at least 7%?

By beginning with a business understanding, I learned that an acceptable amount of customer churn is around 7% annually. However, the dataset indicated that SyriaTel had a customer churn of about 15%. There was no indication of the time period, but I proceeded with the assumption that this churn occurred over the duration of a year.

The problem is framed by the fact that retaining current customers is less expensive than acquiring new ones. It is therefore much better to keep the customers you already have than to replace them. Since solving this problem would save SyriaTel money, any proposed business changes also had to be weighed on cost: an expensive fix would defeat the purpose.

Building a Classifier

To build the classifier that would best serve SyriaTel’s needs, I first considered which metric made the most sense. I determined that a false negative would be worse than a false positive, because a false negative means that a customer who is about to cancel has been overlooked. A false negative is also known as a type II error. To rank my classifiers by how well they minimized false negatives, I used recall: the share of actual churners that the model correctly identifies.
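To make the metric concrete, here is a minimal sketch of recall computed by hand. The labels below are hypothetical and only illustrate the formula; they are not taken from the SyriaTel data.

```python
def recall(true_labels, predicted_labels):
    """Recall = TP / (TP + FN): the share of actual churners the model catches."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels: 1 = churned, 0 = stayed.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# 3 of the 4 actual churners are caught; the one false positive does not affect recall.
print(recall(actual, predicted))  # 0.75
```

Note that the false positive (a loyal customer flagged as a churner) leaves recall unchanged, which is why recall suits a setting where missed churners are the costlier mistake.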

I moved through many models in order to maximize recall:

  • Logistic Regression
  • K-Nearest Neighbors
  • Decision Tree
  • Bagging + Decision Tree
  • Gradient Boosting
  • AdaBoost
  • Random Forest
  • XGBoost
  • XGBoost + GridSearchCV

XGBoost with GridSearchCV performed the best with a test recall of 76.24% after tuning the model. I was able to see which features were contributing to churn, but I wanted more information.
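For readers who want to see what tuning for recall can look like, here is a rough sketch. It makes several assumptions: it uses scikit-learn’s GradientBoostingClassifier as a stand-in for XGBoost so that it depends on one library only, it runs on synthetic data rather than the SyriaTel dataset, and the parameter grid is hypothetical, since the grid I actually searched is not listed here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data with roughly 15% positives (NOT the SyriaTel dataset).
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.85, 0.15],
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hypothetical, deliberately small grid to keep the search fast.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.1, 0.3],
}

search = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid,
    scoring="recall",  # choose the combination with the best cross-validated recall
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out test set.
test_recall = recall_score(y_test, search.predict(X_test))
print(search.best_params_)
print(f"test recall: {test_recall:.2%}")
```

The key choice is `scoring="recall"`: without it, the search would rank parameter combinations by accuracy, which can look strong on imbalanced data even when many churners are missed.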

SHAP (SHapley Additive exPlanations)

After building a decent classifier, I stepped outside my familiar tools and used SHAP to explore how strongly each feature affected customer churn. This enabled me to identify meaningful insights and recommend changes to the SyriaTel business model.

Reading the graph below: red points indicate a high value of that feature. If a feature’s points form a tail extending to the right, those values have a positive impact on the model output, pushing the prediction from zero (not churning) toward one (churning).
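To give a feel for what a per-feature contribution means, here is a minimal sketch that computes exact Shapley values for a tiny, hypothetical scoring function. This is not the SHAP library and not my XGBoost model: the function, feature names, weights, and baseline values are all made up for illustration, and the SHAP library estimates these values far more efficiently against a background dataset rather than a single baseline.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values: the average marginal contribution of each feature
    over every order in which features can be switched from baseline to instance."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in order:
            current[name] = instance[name]  # switch this feature to the customer's value
            output = model(current)
            totals[name] += output - previous_output
            previous_output = output
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical toy "churn score" with made-up weights.
def toy_churn_score(x):
    score = 0.1
    score += 0.002 * x["day_minutes"]
    score += 0.05 * x["service_calls"]
    score -= 0.15 * x["voicemail_plan"]
    return score


# Hypothetical reference customer and customer being explained.
baseline = {"day_minutes": 180, "service_calls": 1, "voicemail_plan": 0}
customer = {"day_minutes": 300, "service_calls": 4, "voicemail_plan": 1}

contributions = shapley_values(toy_churn_score, customer, baseline)
for name, value in contributions.items():
    print(f"{name}: {value:+.3f}")
```

In this toy case the contributions for day minutes and service calls come out positive (pushing toward churn) and the voicemail plan comes out negative, and together they add up to the difference between the customer’s score and the baseline score. That additivity is the property a SHAP plot relies on.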

As I observed features by SHAP values, I learned the following:

Contributors of High Customer Churn: Value of One

  • High total number of day minutes
  • High number of customer service calls
  • Customers who have an international plan
  • High number of night minutes
  • High number of international minutes

Contributors of Low Customer Churn: Value of Zero

  • Customers with a voicemail plan
  • Customers with a higher number of voicemail messages
  • High number of international calls

Findings

Within the dataset, it was evident that SyriaTel’s business model was to charge customers based on the number of minutes they used. Looking at the contributors to high customer churn, all of them lead to a higher bill, which appears to deter customers from continuing their phone plan.

Recommendation for SyriaTel

My recommendation for SyriaTel was to create a flat monthly fee for its users so that they would be more likely to stay with the company. By calculating the average charge per user, I found that it is about $60. For this business model to work for both SyriaTel and its customers, the best solution would be to offer a flat monthly fee to customers who are currently charged $40 or less per month, and a higher-tier plan to customers who use their phones more.

Conclusion and Further Work

In conclusion, for SyriaTel to bring its customer churn to an acceptable level, it would need to reduce churn by 7.49 percentage points, which is 247 customers. If the company used the classifier to focus on customers who are likely to churn before they actually leave, it would be able to predict about 75% of potential churns. Of the roughly 15% of customers who would churn with no action, that means flagging in advance about 10 to 11% of all customers. If SyriaTel took action to retain those customers and succeeded with 8 out of every 10, it would reduce overall churn by 8 to 9 percentage points, which would put churn in an acceptable range.
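The arithmetic behind this estimate can be written out as a short sketch. The three inputs are the rounded figures from the paragraph above, and the 80% retention success rate is an assumption rather than a measured result. With unrounded multiplication, the outcome lands slightly above the rounded-down 10% and 8% figures.

```python
# Inputs taken from the text (rounded); retention_success is an assumption.
baseline_churn = 0.15     # share of customers who churn with no action
model_recall = 0.75       # share of churners the classifier flags in advance
retention_success = 0.80  # "8 of 10" flagged churners are successfully retained

flagged = baseline_churn * model_recall   # churners caught, as a share of all customers
prevented = flagged * retention_success   # churn avoided, as a share of all customers
remaining = baseline_churn - prevented    # churn rate after the intervention

print(f"flagged in advance: {flagged:.1%}")   # about 11% of all customers
print(f"churn prevented:    {prevented:.1%}") # about 9 percentage points
print(f"remaining churn:    {remaining:.1%}") # about 6%, under the 7% target
```

Because the result is a simple product, it is easy to test how sensitive the conclusion is: if only half of the flagged customers were retained, the same calculation would leave churn above the 7% target.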

Going forward, once the business model has been improved, any remaining customer churn would point to other factors that customers find undesirable. When customers leave, they are going to the competition: it is unlikely that someone with a high cell phone bill simply decides not to have a phone at all. Understanding churn factors will allow SyriaTel to understand not only why its customers are leaving, but also why they are choosing its competitors. Overall, this gives SyriaTel the opportunity to become more attractive to its customers and to compete well in the market.

