Forecast all sales items individually in one stroke

Motivation:
We will use Facebook’s Prophet to predict sales for each item. There are already many great tutorials out there that teach how to use Prophet. This post focuses on predicting all sales items individually in one stroke (loop).
Solution:
First, we are going to import the necessary libraries:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from Fbprophet import Prophet
%matplotlib inline
Please make sure you mount your Google Drive first, in case you want to use Google’s Colab:
# To be able to access the data stored in Google Drive, we first need to mount our Google Drive so we can load and save files to it.
from google.colab import drive
drive.mount('/content/gdrive')
#We'll need to put in a token which Google will generate for us as soon as we click on the link.

Then we will have a look at our dataframe (data can be found in my github repository):
data = pd.read_excel('/content/gdrive/MyDrive/DDDDFolder/DDDD.xlsx')
data.head()

So what we have got is Sales (both per amount and value) per Sales Day, Customer (e.g., Customer code 0 is one specific customer), and Sales Item (similar to Customer, Sales Item 0 is one particular Sales Item). We will try to predict future sales amounts by looking at past sales dates. Therefore, we are going to drop the other columns:
dfProphet=data.drop(columns=[ 'Customer','SalesItem','SalesTransactionID','SalesValue'])

Now, we want to group monthly sales. We first need to index the Sales Date:
dfProphet['SalesDate'] = pd.to_datetime(dfProphet['SalesDate'])
dfProphet = dfProphet.set_index('SalesDate')
dfProphet.head()

And afterward, we can group monthly sales, using the first of every month (month start, ms):
dfMonthly = dfProphet.resample('MS').sum()
#M stands for end of month, MS start of month, D daily and w weekly
dfMonthly.head()

We have to rename SalesDate as ds and SalesAmount as y (that’s just how Prophet works).
dfMonthlyProph = dfMonthly.reset_index().dropna()
dfMonthlyProph.columns = ['ds', 'y']
dfMonthlyProph.head()

Let us have a look at the timeline chart:
fig = plt.figure(facecolor='w', figsize=(20, 6))
plt.plot(dfMonthlyProph.ds, dfMonthlyProph.y)

To predict, we need to instantiate the model by choosing a seasonality_mode and an interval_width, as well as setting the number of months we want to predict via setting the variable for periods and frequency MS (for month start).
After that, we can simply plot actual and forecasted values by instantiating the model and plotting the forecast:
m = Prophet(seasonality_mode='additive',interval_width=0.95).fit(dfMonthlyProph)
#additive vs multiplicative seasonality, 95% confidence interval
future = m.make_future_dataframe(periods=12, freq='MS')
# vs just periods=365
fcst = m.predict(future)
fig = m.plot(fcst)

Doing the same, but this time dynamically:
from fbprophet.plot import plot_plotly
import plotly.offline as py
py.init_notebook_mode()
fig = plot_plotly(m,fcst)
#py.iplot(fig)
fig.show(renderer="colab")

What is that graph trying to tell us? The blue line is our forecasted sales quantities (what Prophet predicted while trying to fit a model with the lowest overall error on our actual data points). The black dots are our past sales quantities. The blue shade is our confidence interval, meaning that the model would estimate that sales quantity must be within this range. We can see that for May 2017, we face the first sold quantity, which does not fit into the model`s confidence interval.
Doing the loop
Until now, we have forecasted all Sales Items for all Customers. If we wanted to predict a specific Customer and Sales Item combination, we would have to filter beforehand. As a more convenient approach, let us find out how to do this in one Loop instead. We will conduct forecasts for each Sales Item and finally export all the Sales Items’ actual and predicted values into one Excel file:
import pandas as pd
from fbprophet import Prophetdata =
pd.read_excel('/content/gdrive/MyDrive/DDDDFolder/DDDD.xlsx')
#If you should receive an error "future is not defined" when running #with your data, your SalesDate column in Excel might not be #formatted as date.
gData = data.groupby(['SalesItem', 'SalesDate'])['SalesAmount'].sum() #.agg(F.collect_list("SalesAmount")).sort('SalesItem')
gData = gData.to_frame().reset_index()
itemlist = gData.SalesItem.unique()
m = Prophet()
fcst_all = pd.DataFrame() # store all forecasts here
for x in itemlist:
temp = gData[gData.SalesItem == x]
temp = temp.drop(columns=[ 'SalesItem'])
temp['SalesDate'] = pd.to_datetime(temp['SalesDate'])
temp = temp.set_index('SalesDate')
d_df = temp.resample('MS').sum()
d_df = d_df.reset_index().dropna()
d_df.columns = ['ds', 'y']
try:
m = Prophet(seasonality_mode='additive',interval_width=0.95).fit(d_df)
future = m.make_future_dataframe(periods=12, freq='MS')
except ValueError:
pass
fcst = m.predict(future)
fcst['SalesItem'] = x
fcst['Fact'] = d_df['y'].reset_index(drop = True)
fcst_all = pd.concat((fcst_all, fcst))
print(x)
fcst_all.to_excel('ProphetFc.xlsx', index=False)

Congratulations. We now have all actual past values for each Sales Item in one data frame (df), including the predictions. If column fact is filled with y, this record is actual (otherwise predicted).
Many thanks for reading! I hope this article is helpful for you. Feel free to connect with me on LinkedIn, Twitter or Workrooms.
The Jupyter Notebook and data file is also available in my Github repository: https://github.com/DAR-DatenanalyseRehberg/ProphetInALoop