Just recently, Facebook, in collaboration with researchers at Stanford and Monash University, released a new open-source time-series forecasting library called NeuralProphet. NeuralProphet is an extension of Prophet, a forecasting library that was released in 2017 by Facebook’s Core Data Science Team.
NeuralProphet is an upgraded version of Prophet that is built using PyTorch and uses deep learning models such as AR-Net for time-series forecasting. The main benefit of using NeuralProphet is that it features a simple API inspired by Prophet, but gives you access to more sophisticated deep learning models for time-series forecasting.
How to Use NeuralProphet
Installation
You can install NeuralProphet directly with pip using the command below.
pip install neuralprophet
If you plan to use NeuralProphet in a Jupyter Notebook, you may benefit from installing the live version of NeuralProphet with the following command.
pip install neuralprophet[live]
The live version allows you to visualize the training and validation loss for a model in real-time, which is a pretty powerful feature.
Keep in mind that since the library is very new and still undergoing changes and bug fixes, you may benefit from installing the latest version from GitHub using the following commands. If you experience any errors when using the pip installation, this method will probably fix them.
git clone https://github.com/ourownstory/neural_prophet.git
cd neural_prophet
pip install .
Basic Example
You can find the full code for the examples in this article on GitHub.
I used a dataset with historical weather data for Seattle from 1948 to 2017 that you can find on Kaggle. I used the code below to read the dataset and display the first ten rows.
import pandas as pd
from neuralprophet import NeuralProphet
data = pd.read_csv('./data/seattleWeather_1948-2017.csv')
data.head(10)
In order to train NeuralProphet on a dataset, we need to make sure that the data is formatted so that the date column is named ds and the column with the target variable is named y. In the code below, I rearranged the original dataset to match the format expected by NeuralProphet.
prcp_data = data.rename(columns={'DATE': 'ds', 'PRCP': 'y'})[['ds', 'y']]
Now that the data is in the correct format, we can train and validate a NeuralProphet model with just a few lines of code. The fit function used below uses the following parameters:
- The data used for training/validation.
- validate_each_epoch – a flag indicating whether or not to validate the model’s performance on the validation data in each epoch.
- valid_p – a float between 0 and 1 indicating the proportion of the data that should be used for validation.
- freq – the frequency of the data (e.g. 'D' for daily observations).
- plot_live_loss – a flag indicating whether or not to generate a live plot of the model’s training and validation loss.
- epochs – the number of epochs that the model should be trained for.
model = NeuralProphet()
metrics = model.fit(prcp_data, validate_each_epoch=True,
                    valid_p=0.2, freq='D',
                    plot_live_loss=True, epochs=10)
The code above produces the live loss plot below. Notice how the plot and loss values are updated after each epoch.
Once the model is trained, it can be used to make a forecast as demonstrated in the code below. For this example, I used the data to make a 365-day forecast. Keep in mind that this is not a true forecast into the future because the data only contains observations from 1948 to 2017.
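A minimal version of that 365-day forecast, using NeuralProphet's make_future_dataframe and predict methods in the same way as the five-year example that follows, looks like this.
# Extend the dataframe one year into the future and predict precipitation
future = model.make_future_dataframe(prcp_data, periods=365)
forecast = model.predict(future)
forecast_plot = model.plot(forecast)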
We can take this a step further and even generate a five-year forecast.
future = model.make_future_dataframe(prcp_data, periods=365*5)
forecast = model.predict(future)
forecasts_plot = model.plot(forecast)
Notice how the model was able to learn the seasonal patterns in Seattle’s daily precipitation levels.
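One way to double-check this, not shown in the original walkthrough, is NeuralProphet's plot_components method, which breaks the forecast down into its trend and seasonal components.
# Plot the trend and seasonality components learned by the model
components_plot = model.plot_components(forecast)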
Why NeuralProphet is So Powerful
NeuralProphet is definitely a cool tool, but what makes it better than building your own neural network from scratch for time-series forecasting? NeuralProphet is especially powerful when it comes to time-series forecasting because it can take additional information such as trends, seasonality, and recurring events into account. Another great feature of NeuralProphet is that it gives developers access to AR-Net, a simple, yet state-of-the-art neural network for time-series forecasting developed by researchers at Facebook AI.
Trend
With NeuralProphet, we can model trends in time-series data by specifying a few arguments. NeuralProphet allows us to specify the following parameters in the model's constructor when modeling trends:
- n_changepoints – specifies the number of points where the broader trend (rate of increase/decrease) in the data changes.
- trend_reg – a regularization parameter that controls the flexibility of changepoint selection. Larger values (~1–100) will limit the variability of changepoints. Smaller values (~0.001–1.0) will allow for more variability in changepoints.
To understand how we can model trends in time-series data using NeuralProphet, consider the daily stock price data for the S&P 500 index for the last 10 years.
import pandas_datareader as pdr
from datetime import datetime
import matplotlib.pyplot as plt
%matplotlib inline
start = datetime(2010, 12, 13)
end = datetime(2020, 12, 11)
sp500_data = pdr.get_data_fred('sp500', start, end)
plt.figure(figsize=(10, 7))
plt.plot(sp500_data)
plt.title('S&P 500 Prices')
If we take a look at the graph produced by the code above, we can clearly see that the S&P 500 follows a generally increasing trend, with several points where the price rises or falls sharply. We can think of these points as changepoints. With this idea in mind, we can train a NeuralProphet model to predict the S&P 500 prices, only focusing on the trend for the first version of our model.
sp500_data = sp500_data.reset_index().rename(columns={'DATE': 'ds', 'sp500': 'y'}) # the usual preprocessing routine
model = NeuralProphet(n_changepoints=100,
                      trend_reg=0.05,
                      yearly_seasonality=False,
                      weekly_seasonality=False,
                      daily_seasonality=False)
metrics = model.fit(sp500_data, validate_each_epoch=True,
                    valid_p=0.2, freq='D',
                    plot_live_loss=True,
                    epochs=100)
The loss plot looks promising, and it seems like, after a lot of volatility, the model has converged. We can visualize the model’s predictions using the plot_forecast function that I defined below.
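The exact implementation is in the GitHub repository linked above; a rough sketch of what the helper does, built on NeuralProphet's make_future_dataframe, predict, highlight_nth_step_ahead_of_each_forecast, and plot/plot_last_forecast methods (the argument names historic_predictions and highlight_steps_ahead are just this article's own), might look like this.
def plot_forecast(model, data, periods, historic_predictions=False, highlight_steps_ahead=None):
    # Extend the dataframe into the future, optionally keeping predictions on the historical data
    future = model.make_future_dataframe(data, periods=periods,
                                         n_historic_predictions=historic_predictions)
    forecast = model.predict(future)
    if highlight_steps_ahead is not None:
        # Highlight the n-th step-ahead forecast and plot only the most recent forecast window
        model.highlight_nth_step_ahead_of_each_forecast(highlight_steps_ahead)
        return model.plot_last_forecast(forecast)
    return model.plot(forecast)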
Using this function, we can visualize the model’s S&P 500 price predictions on historical data and its forecast for the next 60 days as demonstrated below.
plot_forecast(model, sp500_data, periods=60)
It’s very clear that our model has captured the general increasing trend of the S&P 500 index, but the model seems to suffer from underfitting, particularly when we look at the historical data from January 2019 to December 2020 that was likely used for validation. We can take a look at just the model’s forecast without the predictions on the historical data to see what is really going on here.
Based on the graph above, we can see that the model’s forecast for the future seems to follow a straight line. If stocks were this predictable, no one would even think about hiring a financial advisor to manage their portfolio! Fortunately, we can make this model more realistic by adding seasonality parameters to it.
Seasonality
Real-world time-series data often involves seasonal patterns. This is true even for the stock market and trends such as the January effect may appear from year to year. We can make the previous model more realistic by adding yearly seasonality as demonstrated below.
model = NeuralProphet(n_changepoints=100,
                      trend_reg=0.5,
                      yearly_seasonality=True,
                      weekly_seasonality=False,
                      daily_seasonality=False)
metrics = model.fit(sp500_data, validate_each_epoch=True,
                    valid_p=0.2, freq='D',
                    plot_live_loss=True,
                    epochs=100)
Plotting the model’s predictions on historical data and its forecast for the next two months shows us that this revised model is a bit more realistic.
plot_forecast(model, sp500_data, periods=60, historic_predictions=True)
plot_forecast(model, sp500_data, periods=60, historic_predictions=False, highlight_steps_ahead=60)
Based on the plots above, we can see that this model is a bit more realistic but still suffers from some underfitting. The forecast plot shows a smooth curve that reflects some degree of yearly seasonality, but stocks rarely move so smoothly. Most graphs of stock prices feature several jagged lines. We can capture this volatility in the stock market by using an autoregressive model such as AR-Net.
Using AR-Net
AR-Net is an autoregressive neural network for time-series forecasting. Autoregressive models use historical data from previous timesteps to generate predictions for the next timesteps: the values of the target variable at previous timesteps serve as inputs to the model, which is where the term autoregressive comes from.
For the purpose of forecasting S&P 500 prices, for example, we can train a model that uses the price of the S&P 500 from the past 60 days to predict the price for the next 60 days. These parameters are specified by the n_lags and n_forecasts arguments in the code below.
model = NeuralProphet(
    n_forecasts=60,
    n_lags=60,
    n_changepoints=100,
    yearly_seasonality=True,
    weekly_seasonality=False,
    daily_seasonality=False,
    batch_size=64,
    epochs=100,
    learning_rate=1.0,
)
model.fit(sp500_data,
          freq='D',
          valid_p=0.2,
          epochs=100)
Plotting the forecast for the AR-Net model demonstrates how much better it really is when it comes to capturing movements in the stock market.
plot_forecast(model, sp500_data, periods=60, historic_predictions=True)
plot_forecast(model, sp500_data, periods=60, historic_predictions=False, highlight_steps_ahead=60)
Based on the forecast plots above, we can see that the AR-Net model generates more realistic predictions for the S&P 500 and manages to capture some of the jagged lines in the movements of the stock market. However, we can improve the model even further by allowing it to take real-world events into account.
Recurring Events
We can also configure the model to take into account the dates of national holidays in the U.S. Some holidays, especially those that lead to increases in online and in-person shopping, may impact the movements of the stock market. We can let the model figure this out by adding one simple line of code before training it.
model = model.add_country_holidays("US", mode="additive", lower_window=-1, upper_window=1)
The window parameters represent the window of influence for the holiday. For example, based on the parameters used above, we will not only consider the impact of the Christmas holiday on stock prices on Christmas Day but also on the days immediately before and after the holiday.
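Since the holidays have to be registered before fit is called, the full sequence for this final model (reusing the AR-Net settings from above) looks roughly like this.
# Holidays must be added to the model before it is fit
model = NeuralProphet(n_forecasts=60, n_lags=60, n_changepoints=100,
                      yearly_seasonality=True, weekly_seasonality=False,
                      daily_seasonality=False, batch_size=64,
                      epochs=100, learning_rate=1.0)
model = model.add_country_holidays("US", mode="additive",
                                   lower_window=-1, upper_window=1)
metrics = model.fit(sp500_data, freq='D', valid_p=0.2, epochs=100)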
Plotting the forecasts of the refined model demonstrates the predicted impact of holidays on stock prices.
plot_forecast(model, sp500_data, periods=60, historic_predictions=False, highlight_steps_ahead=60)
In the graph above, we can see the model has predicted pronounced shifts in the S&P 500 around the following holidays:
- Christmas Day (December 25)
- New Year’s Day (January 1)
- Martin Luther King Jr. Day (January 18)
Predicting stock prices is an extremely difficult task, but this final model seems to do a good job of capturing the general trends in the stock market. Of course, due to the day-to-day volatility of the stock market, I would not recommend using this model for your own day trading, but it is still a good demonstration of the capabilities of NeuralProphet.
Summary
NeuralProphet is Facebook’s updated version of Prophet and allows developers to use simple, yet powerful Deep Learning models such as AR-Net for forecasting tasks. What makes NeuralProphet so powerful is its ability to take additional information regarding trends, seasonality, and recurring events into account when generating forecasts.
As I mentioned earlier, you can access the full code for the practical examples in this article on GitHub.
Join my Mailing List
Do you want to get better at data science and machine learning? Do you want to stay up to date with the latest libraries, developments, and research in the data science and machine learning community?
Join my mailing list to get updates on my data science content. You’ll also get my free Step-By-Step Guide to Solving Machine Learning Problems when you sign up! You can also follow me on Twitter for content updates.
And while you’re at it, consider joining the Medium community to read articles from thousands of other writers as well.
Sources
- O. J. Triebe, N. Laptev and R. Rajagopal, AR-Net: A Simple Auto-Regressive Neural Network for Time-Series (2019), arXiv.org.
- Wikipedia, January Effect (2020), Wikipedia, the Free Encyclopedia.