
Facebook’s Prophet + Deep Learning = NeuralProphet

Improving the interpretable Prophet model with the power of Deep Learning

Photo by Drew Beamer on Unsplash


While learning about time series forecasting, sooner or later you will encounter the vastly popular Prophet model, developed by Facebook. It gained a lot of popularity because it provides good accuracy, interpretable results, and, at the same time, automates much of the work (such as hyperparameter selection or feature engineering) for the user. That is why it is relatively straightforward to use both for data scientists and for people with less technical knowledge.

You can imagine I was positively surprised when I recently stumbled upon a new library for time series forecasting – NeuralProphet. As you can deduce from the name, it is pretty much good old Prophet on steroids, where the steroids are neural networks. Given that I currently work quite a lot with time series, I was eager to give it a go and see how it compares to the regular Prophet.

In this article, I give a brief introduction to what NeuralProphet is and how it actually differs from the classic library. Then, I use both libraries on the same time series forecasting task to see how well they perform. Some prior exposure to time series forecasting will definitely help with the terminology. Let’s start!

NeuralProphet

To understand what NeuralProphet is, I will briefly go over the building blocks and explain how all of the pieces fall together.

We should start with autoregressive (AR) models, which are a class of models that attempt to predict future values of a variable based on its past observations. In general, they are linear models (think linear regression) whose inputs are lagged values of the variable we want to predict. In the simplest possible case, you can think of trying to predict tomorrow’s value of variable X using today’s value. Naturally, there are ways of determining which time steps to use when creating the lagged features; however, that is outside the scope of this article.
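To make this concrete, here is a minimal sketch (purely illustrative, not part of any Prophet workflow) of a lag-1 autoregressive model fitted with scikit-learn:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# toy series: a random walk, so past values carry information about the future
rng = np.random.default_rng(42)
series = pd.Series(np.cumsum(rng.normal(size=200)))

# build the lagged feature: predict today's value from yesterday's
df_ar = pd.DataFrame({'y': series, 'y_lag_1': series.shift(1)}).dropna()
ar_model = LinearRegression().fit(df_ar[['y_lag_1']], df_ar['y'])

# the single coefficient shows how strongly yesterday's value drives the prediction
print(ar_model.coef_, ar_model.intercept_)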

The main advantage of AR models is that they are very interpretable – by inspecting the coefficients of the model, we can see how each past time step impacts the prediction. However, this interpretability comes at a price, as these parametric models can turn out to be overly rigid and unable to account for any features beyond the autocorrelation itself. What is more, they do not scale well to datasets with a large number of observations and, especially, a large number of potential features.


We can think of Prophet as an extension of the basic AR models. Instead of just using lagged values of the target variable, the model provides an additional form of feature engineering – it applies Fourier series to the input variable. This enables us – the analysts – to further tune the model for better performance and also decompose the results for better interpretability. Prophet comes with many other nice features, some of which we name below:

  • Prophet can leverage additional, external regressors – so not only the lagged values of the target
  • the model can account for the effect of holidays
  • it can automatically detect change-points in the trends – for example, when an increasing trend starts to become decreasing

The last piece of the puzzle are neural networks. As opposed to autoregressive models, they are non-parametric, which is of great importance when working with time series data, because time series very rarely (if at all) follow a single pattern over extended periods of time. In general, NNs can capture nonlinear relationships and approximate virtually any continuous function, so they will do their best to fit the given data. That is also why we often hear about mind-blowing examples of NNs in action. However, there are a few issues we should be aware of:

  • as the types of networks developed for predicting sequences (and a time series is also a sequence) were originally designed for natural language processing and computer vision, they require a significant volume of data
  • hyperparameter tuning is definitely not as straightforward as with parametric models
  • neural networks are most often considered black-box models, so, to put it very briefly, they are very close to uninterpretable (though there is a lot of research focused on tackling this issue)

As mentioned in the introduction, the main advantages of using Prophet are good performance, interpretability, and ease of setup and use. That is exactly what the authors of NeuralProphet had in mind for their library: retain all the advantages of Prophet while improving its accuracy and scalability by introducing a new backend (PyTorch instead of Stan) and an Auto-Regressive Network (AR-Net), which combines the scalability of neural networks with the interpretability of AR models. To summarize AR-Net in one sentence: it is a single-layer network trained to mimic the AR process in a time series signal, but at a much larger scale than the traditional models.
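To make the idea more tangible, the core of an AR-Net can be sketched as nothing more than a single linear layer in PyTorch (a conceptual toy, not NeuralProphet’s actual implementation):

import torch
from torch import nn

p = 30  # the AR order, i.e., how many lagged values we feed in

# one weight per lag, playing the role of classic AR coefficients,
# but trained with gradient descent, which scales to much larger p
ar_net = nn.Linear(in_features=p, out_features=1)

lagged_values = torch.randn(1, p)  # a dummy batch of p past observations
forecast = ar_net(lagged_values)   # one-step-ahead prediction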

NeuralProphet vs. Prophet

Having briefly described what NeuralProphet is, I would like to focus now on the differences between the two libraries. Using the documentation as a reference, the main differences are:

  • NeuralProphet uses PyTorch’s gradient descent for optimization, which makes the modeling much faster
  • Time-series autocorrelation is modeled using the Auto-Regressive Network
  • Lagged regressors are modeled using a separate Feed-Forward Neural Network
  • Additionally, the model can be configured with non-linear deep layers in the Feed-Forward NNs
  • The model is tuneable to specific forecast horizons (greater than 1)
  • It offers custom losses and metrics

It is important to remember that, at the time of writing this article, the library is in its beta phase. That means a lot can still change, and not all of the features from the original Prophet library have been implemented yet – for example, logistic growth.

Setup

At the time of writing, the best way to install NeuralProphet is to clone the repository and install it directly from there by running pip install . inside the directory. For the most recent information on the installation procedure, please refer to this part of the documentation.

Having installed the libraries, we import them in Python.
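A minimal import block looks roughly like this (note that, at the time of writing, the Prophet package was still distributed under the name fbprophet):

import pandas as pd
from fbprophet import Prophet
from neuralprophet import NeuralProphet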

Preparing the data

For this example, we will work with the Peyton Manning (an NFL star) dataset. If you are not American, you will probably have to check out his Wikipedia entry to understand the joke in the Prophet documentation. Back on track: the dataset contains the log of the daily views of his Wikipedia page.

The dataset was originally scraped by the Facebook team to show how Prophet works and is available for use within both Prophet and NeuralProphet.

We load the data by running the following line:
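A one-liner along these lines does the job, assuming the raw CSV hosted in the Prophet GitHub repository (the exact path may change as the repo evolves):

df = pd.read_csv('https://raw.githubusercontent.com/facebook/prophet/master/examples/example_wp_log_peyton_manning.csv')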

For those who have not worked with Prophet before, to run the model we need two columns with the appropriate names: ds being the date/timestamp column and y being the variable we want to forecast. Given this is a sample dataset, it already comes prepared.

Then, we plot the data to get some basic intuition. Given the ups and downs, I would hazard a guess that the uplifts in popularity are correlated with the NFL seasons, though please correct me if I’m wrong.

df.plot(x='ds', y='y', title='Log daily page views');

To make the example realistic, we will split the dataset into training and test sets, train both models on the training set, and evaluate them on the test set to compare their performance. To keep the case simple, we will take the last 365 observations as the test set, so we will use roughly 7 years of data for training and 1 year for evaluation. This way, we hope to witness both a peak and a valley in the test set.
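A simple index-based split does the trick (train_df and test_df are the names assumed throughout the rest of the snippets):

# hold out the last 365 daily observations for evaluation
train_df = df.iloc[:-365]
test_df = df.iloc[-365:].copy()  # copy so we can safely add columns later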

Prophet

As already mentioned, Prophet makes it really easy for the end-users to obtain forecasts. Practically, we are done in four lines of code, which we can briefly go over. First, we instantiate the model using all the default settings. Then, we fit it to the training data. To obtain the predictions, we need to create a so-called "future dataframe": we specify the number of days we want to forecast ahead, and, by default, the method also includes the historical data, so we will get the fitted values (forecasts for the training period) as well. Lastly, we obtain the predictions and store them in preds_df_1.
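Those four lines might look as follows (future_df is a helper name assumed here; the rest match the names used in this article):

# 1. instantiate the model with all the default settings
prophet_model = Prophet()
# 2. fit the model to the training data
prophet_model.fit(train_df)
# 3. create the "future dataframe": 365 days ahead, history included by default
future_df = prophet_model.make_future_dataframe(periods=365)
# 4. obtain the fitted values and the forecasts
preds_df_1 = prophet_model.predict(future_df)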

Having trained the model, we plot the predictions vs. actual observations (only from the training set).
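In Prophet, plotting the forecast is a single call:

prophet_model.plot(preds_df_1);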

Without looking at the actuals over the test period, it is hard to judge the performance. However, the predictions, together with the confidence intervals, look reasonable.

As the last step, we plot the extracted components. It’s a similar process to the standard time series decomposition.

prophet_model.plot_components(preds_df_1);

In the plot, we can observe the trend over the years, together with the weekly and yearly components.

NeuralProphet

Thankfully, the API of NeuralProphet is close to identical to that of the original Prophet, so knowing our way around from the previous example makes it really easy to apply NeuralProphet to the same time series forecasting task.

Naturally, we use a different class to instantiate the model. The other differences are that we explicitly state the frequency of the data while fitting the model (the original Prophet was primarily intended for daily data) and that we specify the length of the historical sample we want to include in the "future dataframe". To keep the examples as similar as possible, we include the entire history.
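Mirroring the Prophet snippet, the code might look as follows (a sketch against the beta-era API, so argument names may have changed since):

nprophet_model = NeuralProphet()
# the frequency of the data is stated explicitly while fitting
metrics = nprophet_model.fit(train_df, freq='D')
# include the entire history in the "future dataframe"
future_df = nprophet_model.make_future_dataframe(train_df, periods=365,
                                                 n_historic_predictions=len(train_df))
preds_df_2 = nprophet_model.predict(future_df)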

Just as before, we plot the model’s predictions using the following line.

nprophet_model.plot(preds_df_2);

Then, we plot the components. At the time of writing, the predictions do not include confidence intervals (in general, there are far fewer columns in preds_df_2 than in preds_df_1, but this will most likely change soon!), but we can include the residuals for the training period.

nprophet_model.plot_components(preds_df_2, residuals=True);

We can also plot the parameters of the model by running:

nprophet_model.plot_parameters();

We only included two plots here, as the remaining two overlap with the ones obtained using the plot_components (weekly and yearly seasonality).

Performance Comparison

Lastly, we compare the performance of the models. To do so, we put all the predictions into the test DataFrame, plot the results, and calculate the Mean Squared Error (MSE).
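One way to do it, assuming Prophet’s point forecasts live in the yhat column and NeuralProphet’s in yhat1 (their names at the time of writing):

# attach the test-period forecasts from both models to the test frame
test_df['prophet'] = preds_df_1['yhat'].values[-365:]
test_df['neural_prophet'] = preds_df_2['yhat1'].values[-365:]

test_df.plot(x='ds', y=['y', 'prophet', 'neural_prophet'],
             title='Forecasts vs. actuals over the test set');

# Mean Squared Error of each model over the test set
print('Prophet MSE:', ((test_df['y'] - test_df['prophet']) ** 2).mean())
print('NeuralProphet MSE:', ((test_df['y'] - test_df['neural_prophet']) ** 2).mean())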

Running the code generates the following plot.

While it can be difficult to say by eye which forecasts are closer to the actuals, calculating the MSE clearly shows the winner – NeuralProphet.

Using the default settings for both models, we showed that the improved version performs better. However, I am sure that there is still a lot of room for improvement, for example, by including external variables, fine-tuning the change-point selection, or accounting not only for holidays but also for the days around them. But as the goal of this article was to provide a brief overview of the new library, we will wrap up here and leave some room for potential future articles with more in-depth content.

Conclusions

NeuralProphet is a very young library (still in its beta phase), yet it already shows a lot of potential, which can also be inferred from the list of contributors and their prior experience. The authors successfully built on the original Prophet library, further improving the model’s accuracy and scalability while maintaining its ease of use and, most importantly, its interpretability.

Personally, I would not suggest that everybody abandon the Prophet ship in favor of NeuralProphet just yet, mostly because the library is still young and under continuous development, which might not be ideal for a production environment. However, I definitely recommend keeping an eye out for future developments!

You can find the code used for this article on my GitHub. As always, any constructive feedback is welcome. You can reach out to me on Twitter or in the comments.

Found this article interesting? Become a Medium member to continue learning by reading without limits. If you use this link to become a member, you will support me at no extra cost to you. Thanks in advance and see you around!
