Forecasting future demand is a fundamental business problem, and any solution that tackles it successfully will find valuable commercial applications in diverse business segments. In the retail context, Demand Forecasting methods are used to make decisions about buying, provisioning, replenishment, and financial planning. Common time-series methods applied to Demand Forecasting include Moving Average, Exponential Smoothing, and ARIMA. The most popular models in Kaggle time-series forecasting competitions have been Gradient Boosting models, which convert time-series data into tabular data, with lagged values of the time-series as ‘features’ or columns in the table.
The Facebook Prophet model is a type of GAM (Generalized Additive Model) that specializes in business/econometric time-series problems. My objective in this project was to apply the Facebook Prophet model to a Demand Forecasting problem and investigate its performance. To this end, I used the Kaggle M5 Demand Forecasting Competition dataset and participated in the competition, which required generating point forecasts 28 days ahead at a product-store level.
The dataset involves unit sales of 3049 products, classified into 3 product categories (Hobbies, Foods, and Household) and 7 departments. The products are sold in 10 stores located across 3 states (CA, TX, and WI). The diagrams below give an overview of the levels of aggregation of the products. The competition data has been made available by Walmart.
![Fig 1: Breakdown of the time-series Hierarchy and Aggregation Level [2]](https://towardsdatascience.com/wp-content/uploads/2020/07/1D6GIOUk0yv4veKFBK6ym4w.png)
![Fig 2: Data Hierarchy Diagram [2]](https://towardsdatascience.com/wp-content/uploads/2020/07/1_tjr78Wff__TW9KquusjkA.png)
Source: https://mofc.unic.ac.cy/m5-competition/
Data Description
The Sales Data ranges from 2011–01–29 to 2016–06–19, so each product has at most 1941 days, or about 5.4 years, of available data (the 28-day test dataset is not included).
The datasets are divided into Calendar Data, Price Data, and Sales Data [3].
Calendar Data – contains columns such as date, weekday, month, year, and SNAP days for the states TX, CA, and WI. Additionally, the table contains information on holidays and special events (like the Super Bowl) through its event_type_1 and event_type_2 columns. The holidays/special events are divided into cultural, national, religious, and sporting [3].
Price Data – consists of the columns store, item, week, and price, giving the price of an item at a particular store in a particular week [3].
Sales Data – consists of validation and evaluation files. The evaluation file contains sales for 28 additional days, which can be used for model evaluation. The table gives the quantity sold of a particular item in a particular department, state, and store [3].
The data can be found in the link – https://www.kaggle.com/c/m5-forecasting-accuracy/data
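Since the Sales file stores one column per day (d_1, d_2, …), a common first preprocessing step is to melt it into a long format before joining on the calendar and price tables. Below is a minimal sketch using toy data; the id and column naming mimics the Kaggle files but the values are illustrative:

```python
import pandas as pd

# Toy stand-in for the wide Sales table: one row per product-store series,
# one column per day (d_1, d_2, ...), as in the Kaggle files.
sales_wide = pd.DataFrame({
    "id": ["FOODS_1_001_CA_1_evaluation", "HOBBIES_1_002_TX_2_evaluation"],
    "item_id": ["FOODS_1_001", "HOBBIES_1_002"],
    "store_id": ["CA_1", "TX_2"],
    "d_1": [3, 0],
    "d_2": [0, 1],
})

# Melt to long format: one row per (series, day) pair, ready to be
# joined with the Calendar and Price tables on day and store/item keys.
sales_long = sales_wide.melt(
    id_vars=["id", "item_id", "store_id"],
    var_name="d", value_name="units_sold",
)
```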
Data Analysis and Story Telling



As can be seen from the charts above, for every category the highest number of sales occurs in CA, followed by TX and WI. CA contributes around 50% of Hobbies sales. The sales distribution across categories is similar in the three states: in each state, the categories ordered by descending sales are Foods, Household, and Hobbies.
Analysis of Price Data




The charts above show that percentage_price_change is highly right-skewed. A log transformation is applied to sell_price to make its distribution more symmetric.
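The log transformation above can be sketched in a couple of lines; the small price table here is illustrative, not the actual Kaggle data:

```python
import numpy as np
import pandas as pd

# Hypothetical slice of the Price table, following the column
# description above.
prices = pd.DataFrame({
    "store": ["CA_1", "CA_1", "TX_2"],
    "item": ["FOODS_1_001"] * 3,
    "sell_price": [2.50, 2.88, 9.58],
})

# The log compresses the long right tail of sell_price,
# making the distribution more symmetric.
prices["log_sell_price"] = np.log(prices["sell_price"])
```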
Effect of Snap-Days

The chart above shows that higher sales are observed on Snap-Days in all 3 states.
Investigating Seasonality in Sales Data


A seasonal decomposition of the time-series is performed using the seasonal_decompose function from statsmodels.tsa.seasonal. The charts above show linear growth in sales over time (across categories and states) along with seasonal effects. The linearity is particularly evident in the latter half of the time-series, from 2014 onwards. A yearly seasonality is seen in all states and categories.

The chart above shows a weekly seasonality across all 3 categories. Sales on the weekends and on Monday are higher than on the other weekdays.

The above chart shows the monthly seasonality across categories. The pattern seen is that sales are high at the beginning of the month, then decline steadily and pick up again closer to the month-end.

The charts above show yearly seasonality across categories from 2011–2016. The sales behavior is consistent within each category – i.e., Household sales in 2011 are similar to Household sales in 2012, and so on. Thus, historical data will prove useful in forecasting yearly sales in a category – for example, data on 2011 Household sales will be useful in predicting 2012 Household sales.
Model Description
The Prophet model is trained and predictions are made at a product-store level. Thus, 30490 different Prophet models are trained, one for each of the 30490 time-series at the product-store level. Two years of training data and 28 days of prediction/evaluation data are used for model training and evaluation on each series. The final 28 days of test data are hidden by Kaggle. This split of training, evaluation, and test data is shown in the table below:

Two Models are tried across all time-series –
Model 1
```python
from fbprophet import Prophet

def run_prophet(id1, data):
    holidays = get_holidays(id1)  # per-series holiday/SNAP calendar
    model = Prophet(uncertainty_samples=False,
                    holidays=holidays,
                    weekly_seasonality=True,
                    yearly_seasonality=True,
                    changepoint_prior_scale=0.5)
    model.add_seasonality(name='monthly', period=30.5, fourier_order=2)
    model.fit(data)
    future = model.make_future_dataframe(periods=28, include_history=False)
    return model.predict(future)
```
Model 2
```python
def run_prophet(id1, data):
    holidays = get_holidays(id1)  # per-series holiday/SNAP calendar
    model = Prophet(uncertainty_samples=False,
                    holidays=holidays,
                    weekly_seasonality=True,
                    yearly_seasonality=True,
                    changepoint_prior_scale=0.5)
    model.add_seasonality(name='monthly', period=30.5, fourier_order=2)
    model.add_regressor('log_sell_price')
    model.fit(data)
    future = model.make_future_dataframe(periods=28, include_history=False)
    future['log_sell_price'] = data['log_sell_price'].iloc[-1]  # latest price held constant
    return model.predict(future)
```
Model 1 consists of the components holidays, weekly seasonality, yearly seasonality, and monthly seasonality.
Model 2 consists of the same components plus, additionally, the regressor log_sell_price = log(sell_price). The latest sell_price in each product-store series is assumed invariant over the 28-day forecasting horizon and is used for forecasting future sales.
The Facebook Prophet model is similar to a GAM (Generalized Additive Model) and uses a decomposable time-series model with three components – trend, seasonality, and holidays: y(t) = g(t) + s(t) + h(t) + e(t) [4].
Growth g(t): By default, Prophet uses a linear growth model for forecasts, and that is the model used here [4].
Holidays h(t): Prophet models holiday effects. The holidays modeled here are religious holidays, cultural holidays, national holidays, and SNAP days. Prophet allows the user to specify an "upper_window" and a "lower_window", which extend the effect of the holiday around its date. In the current model, upper and lower windows of 0 days are applied to SNAP days, and upper and lower windows of 1 day are applied to the other holidays. Prophet assumes the holidays D_i to be mutually independent [4].
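The holiday windows described above are supplied to Prophet as a DataFrame with holiday, ds, lower_window, and upper_window columns. A sketch with a few illustrative dates (not the full event calendar used in the models):

```python
import pandas as pd

# Special events get a one-day window on each side of the event date.
# Dates here are illustrative examples, not the complete calendar.
events = pd.DataFrame({
    "holiday": "SuperBowl",
    "ds": pd.to_datetime(["2014-02-02", "2015-02-01", "2016-02-07"]),
    "lower_window": -1,   # effect starts the day before
    "upper_window": 1,    # effect ends the day after
})

# SNAP days get no surrounding window (0 days on each side).
snap = pd.DataFrame({
    "holiday": "SNAP_CA",
    "ds": pd.to_datetime(["2016-03-01", "2016-03-02", "2016-03-03"]),
    "lower_window": 0,
    "upper_window": 0,
})

holidays = pd.concat([events, snap], ignore_index=True)
```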
Seasonality s(t): A Fourier series is used to model seasonal effects. In the formula given below, P is the regular period (7 days for weekly, 365.25 days for yearly seasonality). Fitting the seasonal component requires estimating 2N parameters beta = (a1, b1, …, aN, bN) [4].
![Fourier Series for seasonality [4]](https://towardsdatascience.com/wp-content/uploads/2020/07/1jyeJgAIqJeMSGfZDRD9pkg.png)
For example, for yearly data and N = 10, the seasonal component is s(t) = X(t)·beta, where
![X(t) for N=10 [4]](https://towardsdatascience.com/wp-content/uploads/2020/07/11jNiurxBxYsFKNyGh-eQKw.png)
A smoothing prior beta ~ Normal(0, sigma²) is imposed on beta. Increasing the number of terms N in the Fourier series increases model complexity and the risk of overfitting [4].
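The Fourier design matrix X(t) can be sketched in a few lines of NumPy; for yearly seasonality with N = 10 this yields 2N = 20 columns, and the seasonal component is then s(t) = X(t)·beta (the function name here is my own, not Prophet's API):

```python
import numpy as np

def fourier_features(t, period, n_terms):
    """Columns [cos(2*pi*1*t/P), sin(2*pi*1*t/P), ...,
    cos(2*pi*N*t/P), sin(2*pi*N*t/P)] for each time step t."""
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, n_terms + 1):
        cols.append(np.cos(2 * np.pi * n * t / period))
        cols.append(np.sin(2 * np.pi * n * t / period))
    return np.column_stack(cols)

# Two years of daily time steps; yearly period P = 365.25, N = 10 terms
t = np.arange(730)
X = fourier_features(t, period=365.25, n_terms=10)
# s(t) would then be X @ beta for a fitted coefficient vector beta.
```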
Modeling Changepoints: The flexibility of the trend changepoints can be adjusted using the hyperparameter "changepoint_prior_scale", which imposes a sparse prior on the changepoint parameters in Prophet. Increasing this parameter increases model flexibility [4].
Implementing Multiprocessing and optimizing runtime
```python
from joblib import Parallel, delayed

# Fit and predict the 30490 product-store series in parallel
submission = Parallel(n_jobs=4, backend="multiprocessing")(
    delayed(run_prophet)(comb_lst[i][0], comb_lst[i][1]) for i in range(30490)
)

# Inside run_prophet: forecast only the 28-day horizon, not the history
model.make_future_dataframe(periods=28, include_history=False)
```
Model training and prediction in FB Prophet take longer than for an ARIMA or exponential smoothing model. Since the model is fitted 30490 times at a product-store level, it is necessary both to reduce the runtime on an individual series and to parallelize training and prediction across the 30490 series. The former is accomplished by 1) setting uncertainty_samples=False in the Prophet() constructor used in Model 1 and Model 2, which skips the creation of an uncertainty interval for the prediction, and 2) setting include_history=False in the make_future_dataframe() call for model prediction (shown above), which prevents Prophet from also computing the model fit over the training dataset.
The joblib.Parallel() function is used to implement multiprocessing on fitting and prediction for the 30490 series, as shown in the code snippet above.
Model Evaluation
Accuracy of the point forecasts is evaluated using three metrics: RMSE, RMSSE, and WRMSSE. WRMSSE is the evaluation metric used by Kaggle in the competition.



w_i is the weight on each of the 42,840 time-series in the dataset (this count includes the various levels of aggregation shown in Fig 1). The weights are calculated from the cumulative dollar sales of each series over a recent period. Additional details on the weights are given in the competition guide: https://mofc.unic.ac.cy/m5-competition/
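A minimal sketch of the RMSSE for a single series: the forecast's mean squared error is scaled by the in-sample mean squared error of the naive one-step forecast on the training data, and WRMSSE is then the weighted sum of per-series RMSSE values. The function below is my own illustration of the metric, not the competition's official implementation:

```python
import numpy as np

def rmsse(train, actual, forecast):
    """Root Mean Squared Scaled Error for one series.

    Scales the forecast RMSE by the RMSE of the naive one-step
    forecast (y_t = y_{t-1}) over the training period.
    """
    train = np.asarray(train, dtype=float)
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    naive_mse = np.mean(np.diff(train) ** 2)          # scaling denominator
    return np.sqrt(np.mean((actual - forecast) ** 2) / naive_mse)
```

A perfect forecast yields an RMSSE of 0, and a forecast whose errors match the naive in-sample errors yields an RMSSE of 1.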
Results
The performance of Model 1 and Model 2 is given below. The Average RMSE and Average RMSSE are calculated by computing the mean RMSE or RMSSE across all 30490 product-store time-series. As can be seen, the inclusion of log_sell_price in Model 2 improves performance across all metrics. The performance on the hidden Test dataset is calculated by Kaggle.



The graphs above show the RMSE distribution over the 30490 product-store time-series. As can be seen, the distribution is heavily right-skewed. The performance of both models is similar across all RMSE levels. For more details on the code and implementation, kindly refer to the GitHub repository: https://github.com/Ritvik29/Walmart-Demand-Prediction
In conclusion, components in FB Prophet like seasonality, changepoints, growth, and holidays make it especially suitable for tackling business time-series problems like Demand Forecasting. I would recommend that analysts at least consider Prophet as a first stop when building Demand Forecasting models.
References
[1] Kaggle M5 Forecasting – Accuracy competition https://www.kaggle.com/c/m5-forecasting-accuracy
[2] Kaggle M5 Forecasting – Accuracy Documentation https://mofc.unic.ac.cy/m5-competition/
[3] Kaggle M5 – Accuracy Forecasting Competition Data https://www.kaggle.com/c/m5-forecasting-accuracy/data
[4] Taylor, Letham (2017), "Forecasting at Scale" https://peerj.com/preprints/3190/
[5] FbProphet – Quickstart https://facebook.github.io/prophet/docs/quick_start.html#python-api