Multiple Time Series Forecast & Demand Pattern Classification using R — Part 1

Syntetos Boylan Croston (SBC) type, Tidy Forecast, Demand Pattern

Gouthaman Tharmathasan
Towards Data Science


Time Series Analysis is widely used in business to extract useful information such as demand forecasts, seasonal products, demand pattern categories and other characteristics. Here we are going to focus on Time Series forecasting (using Statistical / Machine Learning / Deep Learning models to predict future values) & demand pattern categorization (categorizing products based on order quantity and timing).

In this blog, I am going to explain how we can fit multiple (1,000+) time series models using Statistical (classical), Machine Learning & Deep Learning models, perform time series feature engineering & categorize demand patterns. This series will have the following 5 parts:

Part 1: Data Cleaning & Demand categorization.

Part 2: Fit statistical Time Series models (ARIMA, ETS, CROSTON etc.) using fpp3 (tidy forecasting) R Package.

Part 3: Time Series Feature Engineering using timetk R Package.

Part 4: Fit Machine Learning models (XGBoost, Random Forest, etc.) & Hyperparameter tuning using modeltime & tidymodels R packages.

Part 5: Fit Deep Learning models (NBeats & DeepAR) & Hyperparameter tuning using the modeltime & modeltime.gluonts R packages.

Let’s get started!

PS: This is not the ONLY way to tackle this problem; it is just one approach.

The Data

The data I'm using is from the Food Demand Forecasting hackathon on AnalyticsVidhya. The goal of this hackathon is to forecast the number of orders for each meal/centre combo of a food delivery service. There are 3,548 meal/centre combos in total (spread across 77 centres & 51 meals), meaning that 3,548 time series models will have to be fitted. In a business environment, this technique is also known as Scalable Forecasting.

Let’s import libraries.

pacman::p_load(tidyverse, magrittr) # data wrangling packages
pacman::p_load(lubridate, tsintermittent, fpp3, modeltime, timetk, modeltime.gluonts, tidymodels, modeltime.ensemble, modeltime.resample) # time series model packages
pacman::p_load(foreach, future) # parallel processing packages
pacman::p_load(viridis, plotly) # visualization packages

theme_set(hrbrthemes::theme_ipsum()) # set default theme

Now read in the train data, which is used to fit the time series models, and the submission data, which contains the meal/centre/week combinations for which future values need to be predicted.

meal_demand_tbl <- read.csv(unz("data/raw/train_GzS76OK.zip", "train.csv")) # reading train data
new_tbl <- read.csv("data/raw/test_QoiMO9B.csv") # the data we need to forecast

skimr::skim(meal_demand_tbl %>%
              # remove id
              select(-id) %>%
              # make centre & meal ids factors
              mutate(center_id = factor(center_id),
                     meal_id = factor(meal_id))) # summary of the data

The Data Preprocessing

In this stage, the data preprocessing steps are performed and the data is transformed into a time series object (i.e. a tsibble: a special type of data frame with a time index that the fpp3 package uses to fit time series models).
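As a quick illustration (a toy example, not the hackathon data), a data frame becomes a tsibble once we declare which column identifies each series (the key) and which column is the time index:

# toy example: two weekly series identified by `id`
toy_tbl <- tibble(id = rep(c("A", "B"), each = 3),
                  week_date = rep(seq(as.Date("2016-01-02"), by = "week", length.out = 3), 2),
                  num_orders = c(10, 12, 9, 4, 0, 7))

toy_tsbl <- toy_tbl %>%
  as_tsibble(key = id, index = week_date) # one row per key/index combination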

Summary Table

The above summary shows that there are 51 types of meals sold in 77 centres, which makes a total of 3,548 time series, with data spanning 145 weeks. Here we will need to forecast the number of orders ( num_orders ) for each meal/centre combo. Furthermore, by looking at the complete_rate column, we can see that there are no missing values in any of the variables.

The week column contains integers from 1 to 145, so we will need to convert these to dates. We will also remove the combos (meal/centre) that do not require forecasting.

date_list <- tibble(id = seq(1, 155, 1),
                    week_date = seq(from = as.Date("2016-01-02"), by = "week", length.out = 155))

master_data_tbl <- meal_demand_tbl %>%
  left_join(date_list, by = c("week" = "id")) %>% # join the dates
  inner_join(distinct(new_tbl, meal_id, center_id), by = c("meal_id", "center_id")) %>% # keep only combos that need forecasting
  select(week_date, num_orders, everything(), -c(week, id))

Now let’s transform the train and submission data into complete data, i.e. make the irregular time series regular by inserting rows for the missing week dates. These newly created rows have missing values for num_orders and the other variables. Hence, num_orders was imputed with zero, assuming that no sales occurred in those specific weeks, and the other variables were replaced with their corresponding previous week's values.

For example, the following time series data (Table 1) shows that data is missing after the 4th week up to the 7th week. Table 2 shows the completed data with new entries for those missing weeks (i.e. weeks 5 & 6).

Table 1
Table 2
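As a minimal sketch of this step (toy numbers, not the actual combo from the tables above), fill_gaps() from tsibble inserts the missing weeks, and the two imputation rules are then applied:

# toy series with weeks 5 & 6 missing, as in Table 1
toy_gap_tsbl <- tibble(week_date = as.Date("2016-01-02") + 7 * c(0, 1, 2, 3, 6),
                       num_orders = c(15, 20, 12, 18, 25),
                       checkout_price = c(136, 136, 134, 134, 139)) %>%
  as_tsibble(index = week_date)

toy_gap_tsbl %>%
  fill_gaps(num_orders = 0) %>% # new rows for weeks 5 & 6 with num_orders = 0
  fill(checkout_price) # other variables carry the previous week's value forward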

Then the emailer_for_promotion & homepage_featured variables are transformed into factors.

master_data_tbl <- master_data_tbl %>%
  as_tsibble(key = c(meal_id, center_id), index = week_date) %>%
  ## num_orders missing value imputation ----
  fill_gaps(num_orders = 0, .full = end()) %>% # complete each series up to the latest week date
  ## X variables missing value imputation ----
  group_by_key() %>%
  fill(emailer_for_promotion, homepage_featured, base_price, checkout_price) %>% # fill other variables with the previous week's values
  ungroup() %>%
  ## change variables to factors ----
  mutate(emailer_for_promotion = factor(emailer_for_promotion),
         homepage_featured = factor(homepage_featured))

A similar operation is carried out with the submission file. Here new_data(master_data_tbl, n = 10) generates the next 10 weekly index values for each meal/centre key, which are joined to the submission data before the same gap-filling steps are applied.

## New Table (Submission file) data wrangling ----
new_tbl <- new_tbl %>%
  left_join(date_list, by = c("week" = "id")) %>% # join the dates
  full_join(new_data(master_data_tbl, n = 10), by = c("center_id", "meal_id", "week_date")) %>%
  as_tsibble(key = c(meal_id, center_id), index = week_date) %>%
  group_by_key() %>%
  fill(emailer_for_promotion, homepage_featured, base_price, checkout_price) %>% # fill other variables with the previous week's values
  ungroup() %>%
  # change variables to factors
  mutate(emailer_for_promotion = factor(emailer_for_promotion),
         homepage_featured = factor(homepage_featured))

Visualizing the Time Series Food Data

Plot 1: Number of orders by Centres

master_data_tbl %>%
  # randomly pick 4 centres
  distinct(center_id) %>%
  sample_n(4) %>%
  # join back the transaction data
  left_join(master_data_tbl, by = "center_id") %>%
  group_by(week_date, center_id) %>% # aggregate to centre level
  summarise(num_orders = sum(num_orders, na.rm = TRUE)) %>%
  as_tsibble(key = center_id, index = week_date) %>%
  fill_gaps(num_orders = 0, .full = end()) %>%
  autoplot(num_orders) +
  scale_color_viridis(discrete = TRUE)

The above plot shows that the first few weeks of transactions for Center #24 are zero; these leading zero weeks have been removed. The continuous transactions after this period are kept in the data used to fit the models.

master_data_tbl <- master_data_tbl %>%
  filter(center_id != 24) %>%
  bind_rows(master_data_tbl %>%
              filter(center_id == 24 & week_date > as.Date("2016-07-16"))) # keep only entries after 2016-07-16 for Center #24

Plot 2: Number of Orders by Meal IDs

master_data_tbl %>%
  # randomly pick 3 meals
  distinct(meal_id) %>%
  sample_n(3) %>%
  # join back the transaction data
  left_join(master_data_tbl, by = "meal_id") %>%
  group_by(week_date, meal_id) %>% # aggregate to meal level
  summarise(num_orders = sum(num_orders, na.rm = TRUE)) %>%
  as_tsibble(key = meal_id, index = week_date) %>%
  fill_gaps(num_orders = 0, .full = end()) %>%
  autoplot(num_orders) +
  scale_color_viridis(discrete = TRUE)

The above plot shows that new meals are introduced over time, resulting in shorter time series. These shorter series may need to be treated separately when setting up the cross-validation scheme.


Demand Categorization — SBC Method.

Now we are going to identify the demand pattern category (Smooth, Erratic, Lumpy & Intermittent) of each meal/centre combo.

Why?

When you forecast real-world data (i.e. your organization's data), you will find that the majority of your products have a sporadic demand pattern. These products show low forecast accuracy, and it is difficult to improve it because their forecastability is inherently low. So what can we do for these products? Rather than spending time trying to improve their forecast accuracy, we should calculate an appropriate safety stock.

Furthermore, most of the time these sporadic products are not high revenue generators. This lets us split the forecasting project into two parts: for regular demand products, focus on improving forecast accuracy; for sporadic demand products, calculate safety stock.
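As an aside, a common textbook formula for sizing that safety stock (purely an illustration with made-up numbers, not part of the hackathon workflow) multiplies demand variability over the lead time by a service-level factor:

# illustrative safety stock: SS = z * sd(weekly demand) * sqrt(lead time in weeks)
service_level <- 0.95 # target cycle service level
z <- qnorm(service_level) # z-score for the chosen service level (~1.64)
demand_sd <- 35 # standard deviation of weekly demand (made-up value)
lead_time_weeks <- 2 # replenishment lead time in weeks (made-up value)

safety_stock <- z * demand_sd * sqrt(lead_time_weeks)
safety_stock # roughly 81 units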

Also, identifying these demand patterns means that different time series models can be applied to each of them. For example, Croston & SBA are suitable for sporadic demand patterns, as illustrated in the sketch below.
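Fitting these models is covered properly in Part 2; purely as a teaser, a minimal sketch with fable's CROSTON() (loaded via fpp3) on one hypothetical sporadic combo might look like this:

# sketch only: the centre/meal ids below are placeholders for a sporadic combo
master_data_tbl %>%
  filter(center_id == 55, meal_id == 1885) %>%
  model(croston = CROSTON(num_orders)) %>% # Croston's method from the fable package
  forecast(h = 10) # 10-week-ahead forecast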

How?

In R we can use the idclass function from the tsintermittent package. When its type argument is set to SBC, it calculates the following two values:

  1. cv2 — measures the variation in quantities
  2. p — measures the inter-demand interval

Based on these two measures, we can categorize the demand pattern as Smooth (p < 1.32 & cv2 < 0.49), Erratic (p < 1.32 & cv2 ≥ 0.49), Lumpy (p ≥ 1.32 & cv2 ≥ 0.49) & Intermittent (p ≥ 1.32 & cv2 < 0.49). The short sketch after this paragraph shows how these measures and cut-offs play out on a toy demand vector.
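Below is a minimal illustration (toy numbers, computed by hand rather than via idclass) of what the two measures capture and how the cut-offs are applied:

# toy weekly demand with several zero weeks
demand <- c(0, 6, 0, 0, 4, 0, 80, 0, 0, 5)
nonzero <- demand[demand > 0]

# p: average inter-demand interval (total periods per period with demand)
p <- length(demand) / length(nonzero) # 10 / 4 = 2.5

# cv2: squared coefficient of variation of the non-zero demand sizes
cv2 <- (sd(nonzero) / mean(nonzero))^2 # about 2.5

# apply the SBC cut-offs
dplyr::case_when(p < 1.32 & cv2 < 0.49 ~ "Smooth",
                 p >= 1.32 & cv2 < 0.49 ~ "Intermittent",
                 p < 1.32 & cv2 >= 0.49 ~ "Erratic",
                 TRUE ~ "Lumpy") # returns "Lumpy" for this toy series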

Smooth:

The smooth demand pattern shows regularity in both quantity and time, i.e. this type of product is sold every day or every week in similar quantities.

Erratic:

The erratic demand pattern shows regularity in time but the quantity sold varies dramatically, i.e. this type of product is sold every day or every week, but one day it may sell 3 units while another day it may sell 100.

Intermittent:

The intermittent demand pattern shows irregularity in time but regularity in quantity, i.e. the product may sell in one week, then not sell for several weeks, but when it does sell, it sells in roughly the same quantity.

Lumpy:

The lumpy demand pattern shows irregularity in both time and quantity. This type of demand pattern is difficult to forecast no matter which time series model is used. A solution for this type of product is to hold safety stock.

Let’s start the coding!

First, we transform the data from long format to wide format, i.e. create 3,548 columns (one per meal/centre time series). For example:

Longer Format
Wider Format
# make each combo's time series a column
wide_dt <- master_data_tbl %>%
  as_tibble() %>% # drop the tsibble structure before reshaping
  transmute(week_date, id = paste0("x_", center_id, "_", meal_id), num_orders) %>%
  pivot_wider(names_from = id, values_from = num_orders, values_fill = 0) %>%
  arrange(week_date) %>% # arrange by week date
  select(-week_date) %>%
  data.frame()

Then apply idclass to the transformed data frame.

ts_cate_obj <- idclass(wide_dt, type = "SBC", outplot = "none")

The ts_cate_obj object returned above holds the cv2 and p values for each of the 3,548 series. We will now combine these into a data.frame and apply the cut-off values to categorize the demand patterns.

ts_categorization <- data.frame(id = colnames(wide_dt),
                                cv2 = ts_cate_obj$cv2,
                                p = ts_cate_obj$p) %>%
  separate(id, into = c("x", "center_id", "meal_id"), sep = "_") %>%
  select(-x) %>%
  mutate(demand_cate = case_when(p < 1.32 & cv2 < 0.49 ~ "Smooth",
                                 p >= 1.32 & cv2 < 0.49 ~ "Intermittent",
                                 p < 1.32 & cv2 >= 0.49 ~ "Erratic",
                                 p >= 1.32 & cv2 >= 0.49 ~ "Lumpy")) %>%
  mutate(center_id = as.integer(as.character(center_id)),
         meal_id = as.integer(as.character(meal_id)))

Let’s see the summary of the above analysis in a bar chart.
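The plotting code is not shown here; a minimal sketch of how such a bar chart could be built from ts_categorization (using ggplot2 from the tidyverse and the viridis palette loaded above) is:

ts_categorization %>%
  count(demand_cate) %>% # number of meal/centre combos per category
  ggplot(aes(x = reorder(demand_cate, -n), y = n, fill = demand_cate)) +
  geom_col(show.legend = FALSE) +
  scale_fill_viridis(discrete = TRUE) +
  labs(x = "Demand Category", y = "Number of meal/centre combos",
       title = "Demand Pattern Categories")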

The above plot shows that the majority of meal/centre combos in this data fall under the Smooth & Erratic categories, meaning that regular time series models such as ARIMA, ETS etc. should suit them well. Advanced models such as Croston & SBA are fitted to tackle the Intermittent & Lumpy demand patterns, and the demand pattern type can also be used as a feature when fitting machine learning / deep learning models. Combos with almost no demand (i.e. fewer than 20 transactions) are handled separately through the cross-validation setup.

In the following parts, we will do feature engineering, fit time series, machine learning and deep learning models, and compare their accuracies to find a suitable model.

References

Kourentzes, N., 2014. Intermittent Demand Forecasting Package For R — Nikolaos Kourentzes. Kourentzes.com. Available at: <https://kourentzes.com/forecasting/2014/06/23/intermittent-demand-forecasting-package-for-r/> [Accessed 22 January 2021].

frePPLe. 2021. Demand Classification: Why Forecastability Matters — Frepple APS. Available at: <https://frepple.com/blog/demand-classification/> [Accessed 22 January 2021].
