
Exploring Different Approaches to Generate Response Curves in Marketing Mix Modeling

Comparing Saturation Function and Partial Dependence for Response Curve Generation

Photo by Alexander Grey on Unsplash

Response curves are an essential component of marketing mix modeling, which is a statistical technique used to analyze the impact of various marketing strategies and tactics on sales or other business outcomes. Response curves represent the relationship between a marketing variable (e.g., advertising spend, price, promotion, etc.) and the sales or revenue generated by a product or service.

The importance of response curves lies in their ability to reveal the effectiveness of each marketing variable and how it contributes to overall response. By analyzing response curves, marketers can gain valuable insights into which marketing tactics are driving the most sales and which are not delivering the desired results.

There are different approaches to building response curves, and in this article, I explore two prominent methods: the straightforward approach using saturation transformation and the approach based on partial dependence. I evaluate these approaches using two different families of algorithms: linear regression and gradient boosting. Moreover, I show that the partial dependence approach can be used in conjunction with response curves generated by SHAP values when using complex machine learning algorithms.

Saturation Functions / Transformations

The straightforward approach to building response curves involves using saturation functions (transformations) such as Logistic, Negative Exponential or Hill. A saturation function is a mathematical function that captures the diminishing returns effect, where the impact of a marketing variable saturates as its value increases. By using a saturation function, the relationship between the marketing variable and the response variable can be transformed into a non-linear form. This enables the model to capture the saturation effect and more accurately represent the true relationship between marketing efforts and the response (sales or revenue).
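
As a minimal sketch of what such transformations look like in code, the three families named above can be written in a few lines of NumPy. The parameter names (`half_saturation`, `shape`, `rate`) and the example values are illustrative assumptions, and the logistic form shown is only one common parameterization; none of this is taken from the pipeline used later in the article.

```python
import numpy as np


def hill(x, half_saturation, shape):
    """Hill saturation: 0 at zero spend, 0.5 at `half_saturation`, approaches 1."""
    x = np.asarray(x, dtype=float)
    return x**shape / (x**shape + half_saturation**shape)


def negative_exponential(x, rate):
    """Negative exponential saturation: 1 - exp(-rate * x)."""
    x = np.asarray(x, dtype=float)
    return 1.0 - np.exp(-rate * x)


def logistic_saturation(x, rate):
    """One common logistic form: (1 - exp(-rate * x)) / (1 + exp(-rate * x))."""
    x = np.asarray(x, dtype=float)
    return (1.0 - np.exp(-rate * x)) / (1.0 + np.exp(-rate * x))


# Hypothetical spend levels, only to show how the functions are called.
spend = np.linspace(0, 100, 5)
hill_curve = hill(spend, half_saturation=50.0, shape=2.0)
negexp_curve = negative_exponential(spend, rate=0.05)
logistic_curve = logistic_saturation(spend, rate=0.05)
```

All three map non-negative spend into the interval [0, 1), so the fitted coefficient of a transformed channel can be read as the maximum effect that channel can reach.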

Modeling Marketing Mix using PyMC3

One advantage of using a saturation transformation is its simplicity and interpretability. The response curve is defined by a mathematical function with fixed parameters, which results in a smooth curve that is easy to visualize. However, the saturation function has to be chosen before modeling. Different functions may yield different results, so the selection should be based on the characteristics of the data and the underlying assumptions of the model.

Linear Regression and the Need for Non-Linearity

In marketing mix modeling, linear regression is a commonly used technique to analyze the relationship between marketing variables and the response variable. However, linear regression assumes a linear relationship between the predictor variables and the response variable. This can pose a limitation when trying to capture non-linear relationships that often exist in marketing data.

To overcome this limitation and introduce non-linearity into the modeling process, a saturation function or transformation has to be applied to the marketing variables. The model remains linear in its coefficients, but because each marketing variable enters through a non-linear transformation, the fitted relationship between the original variable and the response becomes non-linear.

Modeling Marketing Mix Using Smoothing Splines

Partial Dependence Approach

The partial dependence approach is a more general method that can be used to model the relationship between any marketing variable and the response. This approach involves isolating the effect of one variable while holding all other variables constant. By varying the value of the marketing variable of interest and observing the corresponding response, a plot of the partial dependence can be created.

Unlike the smooth response curve generated by saturation transformations, the plot produced by the partial dependence approach is not necessarily smooth. Its shape depends on the underlying modeling algorithm and on the relationship between a media variable and the response. The partial dependence approach is useful when the relationship is complex and non-linear, and it can be applied both when a saturation transformation is used explicitly and when the algorithm handles non-linearity on its own without an additional saturation transformation.
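
The procedure described above can be sketched in a model-agnostic way. This is a minimal illustration, not the implementation used in this article: `partial_dependence_curve` is a hypothetical helper, `predict` stands for any fitted model's prediction function, and the data and toy predictor are made up.

```python
import numpy as np


def partial_dependence_curve(predict, X, feature_idx, grid):
    """Average prediction when one feature is forced to each grid value.

    All other columns keep their observed values, which is what
    "holding all other variables constant" means in practice.
    """
    X = np.asarray(X, dtype=float)
    curve = np.empty(len(grid))
    for i, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # overwrite the variable of interest
        curve[i] = predict(X_mod).mean()
    return curve


def toy_predict(X):
    """Hypothetical fitted model: concave in column 0, linear in column 1."""
    return 3.0 * np.sqrt(X[:, 0]) + 0.5 * X[:, 1]


rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(208, 2))  # made-up weekly data
spend_grid = np.linspace(0, 100, 11)
pd_curve = partial_dependence_curve(toy_predict, X_demo, 0, spend_grid)
```

Because the helper only needs a prediction function, the same code applies to a linear model with transformed inputs and to a tree ensemble.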

Improving Marketing Mix Modeling Using Machine Learning Approaches

Data

I continue using the dataset made available by Robyn under the MIT License, as in my previous articles, and follow the same data preparation steps, applying Prophet to decompose trend, seasonality, and holidays.

The dataset consists of 208 weeks of revenue (from 2015-11-23 to 2019-11-11) with:

  • 5 media spend channels: tv_S, ooh_S, print_S, facebook_S, search_S
  • 2 media channels that also have exposure information (impressions, clicks): facebook_I, search_clicks_P (not used in this article)
  • Organic media without spend: newsletter
  • Control variables: events, holidays, competitor sales (competitor_sales_B)

Modeling

I built a complete working MMM pipeline that can be applied in a real-life scenario for analyzing media spend on the response variable, consisting of the following components:

A note on coefficients

In scikit-learn, Ridge Regression does not provide a built-in option to enforce positive coefficients for a subset of variables. A potential workaround is to reject an Optuna trial whenever any of the media coefficients is negative. This can be done by returning an exceptionally large objective value for that trial, so that the optimizer steers away from hyperparameters that produce negative media coefficients. An alternative is described in my article on how to wrap R's glmnet in Python, which allows coefficients to be constrained for a subset of variables.
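
A minimal sketch of this rejection idea follows. The names `make_objective`, `media_idx`, and `PENALTY`, the search range for `alpha`, and the in-sample error used as the score are all assumptions made for brevity; a real pipeline would score on held-out data and would typically tune the saturation parameters in the same objective.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

PENALTY = 1e10  # assumed "exceptionally large" value for rejected trials


def make_objective(X, y, media_idx):
    """Build an Optuna-style objective that rejects negative media coefficients.

    `media_idx` lists the columns of X that hold media variables.
    """

    def objective(trial):
        # Regularization strength proposed by the optimizer (assumed range).
        alpha = trial.suggest_float("alpha", 1e-3, 1e3, log=True)
        model = Ridge(alpha=alpha).fit(X, y)

        # Reject the trial if any media coefficient is negative.
        if np.any(model.coef_[media_idx] < 0):
            return PENALTY

        # In-sample error, used here only to keep the sketch short.
        return mean_squared_error(y, model.predict(X))

    return objective
```

With Optuna installed, the returned function can be passed to `study.optimize` on a study created with `optuna.create_study(direction="minimize")`.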

For ridge regression, I apply a saturation transformation and generate response curves using both the saturation function and partial dependence approaches. With LightGBM, I allow the model to naturally capture non-linearities and generate response curves using the partial dependence approach. Additionally, I overlay SHAP values on the response curve to provide further insights.

Results

Ridge Regression with saturation transformation

As can be observed, both response curves generated using the saturation function and partial dependence exhibit overlapping patterns, indicating that the two methods capture similar relationships between the marketing variable and the response.
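
One way to see why the two curves overlap: the ridge model is additive in the transformed channels, so the partial dependence of a channel equals its coefficient times the saturation function plus a constant that comes from the intercept and the other variables. The sketch below checks this on synthetic data. The channel names, the fixed Hill parameters, and the data-generating process are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import Ridge


def hill(x, half_saturation, shape):
    x = np.asarray(x, dtype=float)
    return x**shape / (x**shape + half_saturation**shape)


# Assumed, fixed Hill parameters (half_saturation, shape) per channel.
params = [(40.0, 2.0), (20.0, 1.5)]


def transform(X_raw):
    """Apply each channel's saturation function column by column."""
    cols = [hill(X_raw[:, j], *params[j]) for j in range(X_raw.shape[1])]
    return np.column_stack(cols)


rng = np.random.default_rng(0)
n = 208
tv = rng.uniform(0, 100, n)     # hypothetical weekly TV spend
search = rng.uniform(0, 50, n)  # hypothetical weekly search spend
X_raw = np.column_stack([tv, search])
y = 10 + 30 * hill(tv, 40.0, 2.0) + 12 * hill(search, 20.0, 1.5) + rng.normal(0, 1, n)

model = Ridge(alpha=1.0).fit(transform(X_raw), y)
grid = np.linspace(0, 100, 21)

# Saturation approach: coefficient times the transformed grid.
saturation_curve = model.coef_[0] * hill(grid, *params[0])

# Partial dependence approach: overwrite TV spend, average the predictions.
pd_curve = np.empty_like(grid)
for i, v in enumerate(grid):
    X_mod = X_raw.copy()
    X_mod[:, 0] = v
    pd_curve[i] = model.predict(transform(X_mod)).mean()

# The two curves differ only by a constant vertical offset.
offset = pd_curve - saturation_curve
```

After shifting one curve by that offset, the two coincide, which is consistent with the overlapping patterns reported above.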

LightGBM

As mentioned previously, the response curve generated by partial dependence is not necessarily smooth. One reason lies in the nature of gradient boosting algorithms: each decision tree splits the feature space into regions and predicts a constant value within each region, so the sum over many trees, which may also encode interactions between variables, changes in steps rather than continuously.
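
The step-like behaviour can be illustrated on synthetic data. To keep the sketch dependent on scikit-learn only, it uses `GradientBoostingRegressor` as a stand-in for LightGBM; the data, the hyperparameters, and the grid are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 100, size=(300, 2))  # made-up spend for two channels
y = 50 * (1 - np.exp(-0.03 * X[:, 0])) + 0.2 * X[:, 1] + rng.normal(0, 1, 300)

# Small ensemble: 20 trees, each with at most 3 split thresholds (depth 2).
gbm = GradientBoostingRegressor(n_estimators=20, max_depth=2, random_state=0)
gbm.fit(X, y)

# Partial dependence of the first channel on a fine grid of 200 points.
grid = np.linspace(0, 100, 200)
pd_curve = np.empty_like(grid)
for i, v in enumerate(grid):
    X_mod = X.copy()
    X_mod[:, 0] = v
    pd_curve[i] = gbm.predict(X_mod).mean()

# The curve is piecewise constant: with at most 60 thresholds on this
# feature, it can take at most 61 distinct levels across the 200 points.
n_levels = len(np.unique(np.round(pd_curve, 8)))
```

More trees and deeper trees add more thresholds, which makes the steps finer but does not make the curve smooth in the strict sense.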

Image by the author

The plots below depict the response curves for Ridge Regression and LightGBM, highlighting the disparities between the two algorithms in capturing diminishing returns. Furthermore, we observe that the SHAP values offer a reliable approximation of the response curves generated by the partial dependence approach.

Image by the author
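
Some intuition for why SHAP values track the partial dependence curve: for a purely additive model, Shapley values computed with the marginal (interventional) expectation coincide exactly with the partial dependence centred on the average prediction. The sketch below verifies this by brute force on a hypothetical additive model. It does not use the `shap` library or LightGBM, and when a model contains interactions the two quantities no longer have to match exactly.

```python
from itertools import combinations
from math import factorial

import numpy as np


def model(X):
    """Hypothetical additive model with no interactions between columns."""
    return 3.0 * (1 - np.exp(-0.02 * X[:, 0])) + 2.0 * np.sqrt(X[:, 1]) + 0.5 * X[:, 2]


def coalition_value(model, x, background, subset):
    """Mean prediction when the features in `subset` are fixed to x's values."""
    X = background.copy()
    cols = list(subset)
    if cols:
        X[:, cols] = x[cols]
    return model(X).mean()


def shapley_value(model, x, background, j):
    """Exact Shapley value of feature j at point x (exponential cost in d)."""
    d = background.shape[1]
    others = [k for k in range(d) if k != j]
    phi = 0.0
    for size in range(d):
        weight = factorial(size) * factorial(d - size - 1) / factorial(d)
        for subset in combinations(others, size):
            with_j = coalition_value(model, x, background, subset + (j,))
            without_j = coalition_value(model, x, background, subset)
            phi += weight * (with_j - without_j)
    return phi


def centered_pd(model, background, j, value):
    """Partial dependence at `value`, minus the average prediction."""
    X = background.copy()
    X[:, j] = value
    return model(X).mean() - model(background).mean()


rng = np.random.default_rng(0)
background = rng.uniform(0, 100, size=(50, 3))  # made-up data
x = background[0]
shap_0 = shapley_value(model, x, background, 0)
pd_0 = centered_pd(model, background, 0, x[0])
```

Here `shap_0` and `pd_0` agree up to floating-point error, which mirrors the close agreement observed between the SHAP overlay and the partial dependence curves.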

Conclusion

Response curves play a crucial role in Marketing Mix Modeling by providing insights into the effectiveness of different marketing variables and their contribution to overall response. In this article, I explored two prominent methods for generating response curves: the straightforward approach using saturation transformations and the partial dependence approach. I evaluated these approaches using two families of algorithms, linear regression and gradient boosting, and demonstrated the contrasting ways in which different algorithms capture non-linear response. Additionally, I compared the response curves generated using SHAP values to the results obtained through the partial dependence approach.

The complete code can be downloaded from my GitHub repo.

Thanks for reading!

