
Table of Contents
- Introduction
- What You’ll Learn
- What is a Marketing Mix Model?
- Walkthrough of the Marketing Mix Model
Introduction
Due to high demand, I’m back with another step-by-step data science project with Python code! This one is pretty interesting because there’s so much more you can do beyond what I’m about to present – however, I believe this provides a good start for anyone who’s interested in marketing and data science.
This project is related to a common real-life problem that many businesses face – marketing attribution. This is the science of determining how much each marketing channel is contributing to sales/conversions. The difficulty typically arises when you introduce offline marketing channels, like TV or radio, because there’s no direct way of measuring the impact of these channels.
What You’ll Learn
- You’ll learn what a Marketing Mix Model (MMM) is and how you can use it to assess various marketing channels
- You’ll learn fundamental concepts and techniques for exploring your data
- You’ll learn what Ordinary Least Squares (OLS) regression is, how to implement it, and how to assess it
What is a Marketing Mix Model?
A Marketing Mix Model (MMM) is a technique used to determine marketing attribution. Specifically, it is a statistical technique (usually regression) applied to marketing and sales data to estimate the impact of various marketing channels.
Unlike Attribution Modeling, another technique used for marketing attribution, Marketing Mix Models attempt to measure the impact of channels that are hard to measure directly, like TV, radio, and newspapers.
Generally, your output variable will be sales or conversions, but it can also be something like website traffic. Your input variables typically consist of marketing spend by channel by period (day, week, month, quarter, etc…), but can also include other variables which we’ll get to later.
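In its simplest linear form, the setup described above can be written as a single regression equation. This is only a sketch of the basic additive form, and the symbols are notation I’m introducing here for illustration:

```latex
y_t = \beta_0 + \sum_{c=1}^{C} \beta_c \, x_{c,t} + \varepsilon_t
```

Here y_t is the output (e.g. sales) in period t, x_{c,t} is the spend on channel c in period t, \beta_c is the estimated effect of one extra unit of spend on channel c, \beta_0 is the intercept, and \varepsilon_t is the error term. For the dataset in this walkthrough there are three channels (TV, radio, and newspaper), so C = 3.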
Walkthrough of the Marketing Mix Model
For this project, we’re going to use a fictional dataset that consists of marketing spend on TV, radio, and newspaper, as well as the corresponding dollar sales by period.
Dataset is here.
Setup
First, we’re going to import the libraries and read the data, as usual.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("../input/advertising.csv/Advertising.csv")
Understand my Variables
Next, we’re going to look at the variables in the dataset and understand what we’re working with.
print(df.columns)
df.describe()

Instantly, you can see that the variable, Unnamed: 0, is essentially an index starting at 1 – so we’re going to remove it.
df = df.drop(columns=['Unnamed: 0'])
Because this is a fictional and simple dataset, there are a lot of steps that we don’t have to worry about, like handling missing values. But generally, you want to make sure that your dataset is clean and ready for EDA.
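If you’re working with messier data, a quick check for missing values and duplicated rows is a reasonable first step. Here’s a minimal sketch, assuming a small made-up DataFrame (toy) with the same column names as our dataset. The values are hypothetical, and I planted one missing value and one duplicate on purpose:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the advertising dataset;
# one missing value and one duplicated row are planted on purpose.
toy = pd.DataFrame({
    "TV": [230.1, 44.5, np.nan, 151.5, 151.5],
    "radio": [37.8, 39.3, 45.9, 41.3, 41.3],
    "newspaper": [69.2, 45.1, 69.3, 58.5, 58.5],
    "sales": [22.1, 10.4, 9.3, 18.5, 18.5],
})

missing_per_column = toy.isna().sum()        # number of NaNs in each column
n_duplicates = int(toy.duplicated().sum())   # rows identical to an earlier row

print(missing_per_column)
print("Duplicate rows:", n_duplicates)

# One simple (not always appropriate) cleanup: drop incomplete and repeated rows.
clean = toy.dropna().drop_duplicates().reset_index(drop=True)
print(clean.shape)
```

Dropping rows is the bluntest option. Depending on the data, imputing missing values may be a better choice.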
Exploratory Data Analysis (EDA)
The first thing that I always like to do is create a correlation matrix because it allows me to get a better understanding of the relationships between my variables at a glance.
corr = df.corr()
sns.heatmap(corr, xticklabels=corr.columns, yticklabels=corr.columns, annot=True, cmap=sns.diverging_palette(220, 20, as_cmap=True))

Immediately, we can see that there’s a strong correlation between TV and sales (0.78), a moderate correlation between radio and sales (0.58), and a weak correlation between newspaper and sales (0.23). It’s still too early to conclude anything but this is good to keep in mind moving forward.
Next, I want to create a pairplot of my variables so that I can understand the relationships between them even better.
sns.pairplot(df)

This seems to be in line with the correlation matrix, as there appears to be a strong relationship between TV and sales, less for radio, and even less for newspapers.
Feature Importance
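Feature importance allows you to determine how "important" each input variable is for predicting the output variable. One common definition is permutation importance: a feature is important if shuffling its values increases model error, because this means the model relied on the feature for the prediction. The feature_importances_ attribute of a random forest that we use below is a different, impurity-based measure: it reflects how much each feature reduces the variance of the target across the splits in the trees.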
We’re going to quickly create a random forest model so that we can determine the importance of each feature.
# Setting X and y variables
X = df.loc[:, df.columns != 'sales']
y = df['sales']
# Building Random Forest model
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error as mae
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.25, random_state=0)
model = RandomForestRegressor(random_state=1)
model.fit(X_train, y_train)
pred = model.predict(X_test)
# Report the error on the held-out test set
print("Test MAE:", mae(y_test, pred))
# Visualizing Feature Importance
feat_importances = pd.Series(model.feature_importances_, index=X.columns)
feat_importances.nlargest(25).plot(kind='barh',figsize=(10,10))

There seems to be a pattern, where TV is the most important, followed by radio, leaving newspaper last. Let’s actually build our OLS regression model.
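As a side note before the OLS model: the shuffling-based idea of importance mentioned in this section can be computed with scikit-learn’s permutation_importance function. Here’s a minimal sketch, assuming synthetic stand-in data (X_demo and y_demo are made up, not the article’s dataset) where sales depend strongly on TV, moderately on radio, and not at all on newspaper:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption, not the article's dataset):
# sales depend strongly on TV, moderately on radio, not at all on newspaper.
rng = np.random.default_rng(0)
n = 300
X_demo = pd.DataFrame({
    "TV": rng.uniform(0, 300, n),
    "radio": rng.uniform(0, 50, n),
    "newspaper": rng.uniform(0, 100, n),
})
y_demo = 3 + 0.05 * X_demo["TV"] + 0.1 * X_demo["radio"] + rng.normal(0, 1, n)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)
rf = RandomForestRegressor(n_estimators=100, random_state=1).fit(X_tr, y_tr)

# Shuffle each column of the held-out set several times and record the
# average drop in the model's score (R^2 by default for regressors).
result = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
perm_importances = pd.Series(result.importances_mean, index=X_demo.columns)
print(perm_importances.sort_values(ascending=False))
```

Because the shuffling happens on held-out data, this measure reflects what the model actually relies on when predicting unseen rows, which can differ from the impurity-based values.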
OLS Model
OLS, short for Ordinary Least Squares, is a method used to estimate the parameters in a linear regression model. Check out my article, Linear Regression Explained in 5 Minutes, if you don’t know what regression is!
What makes Python so amazing is that it already has a library that we can use to create an OLS model:
import statsmodels.formula.api as smf
model = smf.ols(formula="sales ~ TV + radio + newspaper", data=df).fit()
print(model.summary())

.summary() provides us with an abundance of insights on our model. I’m going to point out two main things that are most useful for us in this:
- The Adj. R-squared is 0.896, meaning that almost 90% of the variation in sales is explained by our model. That’s pretty good! If you want to learn more about r-squared and other metrics that are used to evaluate Machine Learning models, check out my article here.
- The p-values for TV and radio are displayed as 0.000, meaning they are too small to show at three decimal places, but the p-value for newspaper is 0.86, which indicates that newspaper spend has no statistically significant effect on sales once TV and radio spend are accounted for.
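If you’d rather pull these numbers out programmatically than read them off the printed table, the fitted statsmodels results object exposes them as attributes. Here’s a minimal sketch, assuming synthetic stand-in data (the demo DataFrame and its coefficients are made up) in which newspaper has no true effect:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (an assumption): newspaper has no true effect.
rng = np.random.default_rng(42)
n = 200
demo = pd.DataFrame({
    "TV": rng.uniform(0, 300, n),
    "radio": rng.uniform(0, 50, n),
    "newspaper": rng.uniform(0, 100, n),
})
demo["sales"] = (
    3 + 0.045 * demo["TV"] + 0.19 * demo["radio"] + rng.normal(0, 1.5, n)
)

results = smf.ols("sales ~ TV + radio + newspaper", data=demo).fit()

print("Adjusted R-squared:", round(results.rsquared_adj, 3))
print(results.params)    # estimated coefficients, including the Intercept
print(results.pvalues)   # unrounded p-values, one per coefficient
```

The unrounded values in results.pvalues are handy when the summary table only shows 0.000.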
Next, let’s graph the predicted sales values with the actual sales values to visually see how our model performs:
# Defining Actual and Predicted values
y_pred = model.predict()
labels = df['sales']
df_temp = pd.DataFrame({'Actual': labels, 'Predicted':y_pred})
df_temp.head()
# Creating Line Graph
plt.figure(figsize=(15, 6), dpi=80, facecolor='w', edgecolor='k')
y1 = df_temp['Actual']
y2 = df_temp['Predicted']
plt.plot(y1, label = 'Actual')
plt.plot(y2, label = 'Predicted')
plt.legend()
plt.show()

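Not bad! It seems like this model does a good job of predicting sales given TV, radio, and newspaper spend. Keep in mind that these predictions are made on the same data the model was fitted on, so the graph shows in-sample fit rather than performance on unseen data.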
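To get a rough sense of how an OLS model like this performs on data it hasn’t seen, one option is to fit it on a training split and compare the error on a held-out split. Here’s a minimal sketch, assuming synthetic stand-in data again (the demo DataFrame is made up, and the 75/25 split is just a convenient choice):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption, not the article's dataset).
rng = np.random.default_rng(7)
n = 200
demo = pd.DataFrame({
    "TV": rng.uniform(0, 300, n),
    "radio": rng.uniform(0, 50, n),
    "newspaper": rng.uniform(0, 100, n),
})
demo["sales"] = (
    3 + 0.045 * demo["TV"] + 0.19 * demo["radio"] + rng.normal(0, 1.5, n)
)

# Fit on 75% of the rows, then evaluate on the remaining 25%.
train, test = train_test_split(demo, test_size=0.25, random_state=0)
fit = smf.ols("sales ~ TV + radio + newspaper", data=train).fit()

mae_train = mean_absolute_error(train["sales"], fit.predict(train))
mae_test = mean_absolute_error(test["sales"], fit.predict(test))
print("Train MAE:", round(mae_train, 2))
print("Test MAE:", round(mae_test, 2))
```

If the test error is much larger than the training error, that’s a hint the model is fitting noise rather than a stable relationship.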
Taking it a step further
In reality, the data won’t be as clean as this and the results likely won’t look as pretty. You’ll probably want to consider more variables that impact sales, including but not limited to:
- Seasonality: It’s almost always the case that company sales are seasonal. For example, a snowboard company’s sales would be much higher during the winter than in the summer. In practice, you’ll want to include a variable to account for seasonality.
- Carryover Effects: The impact of marketing is not usually immediate. In many cases, consumers need time to think about their purchasing decisions after seeing advertisements. Carryover effects account for the time lag between when consumers are exposed to an ad and their response to the ad.
- Base sales vs incremental sales: Not every sale is attributable to marketing. If a company spent absolutely nothing on marketing and still made sales, this would be called its base sales. Thus, to take it a step further, you could try to model advertising spend on incremental sales as opposed to total sales.
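To make the carryover idea a bit more concrete, one common way to model it is a geometric "adstock" transformation, where a fixed fraction of each period’s advertising effect carries into the next period. Here’s a minimal sketch, assuming a hypothetical helper (geometric_adstock) and a made-up decay rate of 0.5. In a real project the decay would have to be estimated or tuned per channel:

```python
import numpy as np

def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of each period's ad effect into the next period."""
    spend = np.asarray(spend, dtype=float)
    adstocked = np.zeros_like(spend)
    carry = 0.0
    for t, x in enumerate(spend):
        # Current effect = this period's spend + decayed effect from before.
        carry = x + decay * carry
        adstocked[t] = carry
    return adstocked

# Hypothetical weekly TV spend: a burst in week 1 and a smaller one in week 4.
weekly_tv_spend = [100, 0, 0, 50, 0]
adstocked_tv = geometric_adstock(weekly_tv_spend, decay=0.5)
print(adstocked_tv)  # values: 100, 50, 25, 62.5, 31.25
```

The transformed column would then replace the raw spend column as the regression input. Seasonality could be handled in a similar spirit by adding period indicators (for example, a categorical month term) to the regression formula.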
I hope you enjoyed this project – let me know what other kinds of projects you’d like to see!
Thanks for reading!
If you like my work and want to support me…
- Check out my website, terenceshin.com.