
8 Reasons Why Forecasting Is Hard

Here's what makes forecasting such a thorny task, and how you can cope with these problems

Forecasting is a popular but difficult problem in Data Science.

Challenges arise for several reasons, from non-stationarity to noise and missing values. Tackling these issues may be pivotal for improving forecasting performance.


Introduction

A time series is a sequence of values ordered by time. The key aspect of these datasets is the temporal dependency among observations. What happened in the past affects how the future unfolds.

Time series represent real-world systems across many applications. Instances appear in domains such as finance, retail, or transportation.

So, time series analysis is a popular topic in data science. It enables professionals to make data-driven decisions.

But, learning from time series is challenging. In this story, I list a few reasons why forecasting is a difficult task.


1. Non-stationarity

Stationarity is a central concept in time series. A time series is stationary if its properties (e.g. the mean level) don’t change over time. As I put it in a previous post: the observations do not depend on the time they are observed.

Many existing methods work under the assumption that time series are stationary. But, things like trend or seasonality break stationarity.

Transforming the time series can reduce this problem. For example, differencing helps to stabilise the level of the series. Taking the log stabilises the variance.

There are several statistical tests to check whether a time series is stationary. These include the Augmented Dickey-Fuller, Phillips-Perron, or KPSS tests.
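As a minimal sketch of this workflow, here is how one might test for a unit root with statsmodels’ ADF implementation and then difference the series with pandas. The random-walk series is synthetic, made up for illustration:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
# A random walk is a classic non-stationary series.
series = pd.Series(np.cumsum(rng.normal(size=200)))

# ADF test: the null hypothesis is that the series has a unit root
# (i.e. is non-stationary). A high p-value means we cannot reject it.
print(f"ADF p-value: {adfuller(series)[1]:.3f}")

# First differences stabilise the level of the series.
diff = series.diff().dropna()
print(f"ADF p-value after differencing: {adfuller(diff)[1]:.3f}")
```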

2. Interest in Multiple Horizons

Forecasting is often defined as predicting the next value of a time series.

But, predicting many values in advance has important practical advantages. It reduces long-term uncertainty, thereby enabling better operations planning.

Predicting further into the future entails increased uncertainty. So, forecasting becomes more difficult for longer horizons.

In a previous post, I describe 6 different approaches for multi-step ahead forecasting.
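As a minimal sketch of one common approach, the recursive (iterated) strategy fits a one-step model on lagged values and then feeds each prediction back in as an input. The toy sine series, the lag count, and the choice of Ridge regression below are all illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

def make_lags(y, n_lags):
    # Build a (samples, n_lags) matrix of lagged values and the targets.
    X = np.array([y[i:i + n_lags] for i in range(len(y) - n_lags)])
    return X, y[n_lags:]

y = np.sin(np.arange(300) / 10)  # toy series
X, target = make_lags(y, n_lags=5)
model = Ridge().fit(X, target)

# Forecast 12 steps ahead by feeding predictions back as inputs.
window = list(y[-5:])
forecasts = []
for _ in range(12):
    pred = model.predict([window[-5:]])[0]
    forecasts.append(pred)
    window.append(pred)
```

One drawback of this strategy is that prediction errors accumulate as they are fed back in, which is one reason longer horizons are harder.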

3. Interest in Rare Events

Often, we are interested in predicting rare cases. These are the tails of the distribution.

Take energy production for example. Predicting consumption spikes is critical for managing the grid’s supply and demand.

Often, rare events bear major long-term consequences. A canonical example is a stock market crash. Such events can lead to the financial ruin of many investors.

Rare events may affect the data distribution, thereby rendering the current models obsolete. More on this below in reason #5 (change points).

The main challenge about rare events is that… well, they are rare.

There’s little information about these cases, and how they come about. So, forecasting models have trouble predicting them.

There are a few approaches to improve the prediction of extreme values. These include:

  • Using cost-sensitive models (sketched after this list);
  • Leveraging statistical distributions geared towards extremes;
  • Resampling the distribution of the training data.
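As a minimal sketch of the first approach, one can pass larger sample weights for tail observations to a standard learner. The weighting scheme, the 90th-percentile threshold, and the synthetic heavy-tailed data are illustrative assumptions, not a standard recipe:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
# A heavy-tailed target: Student's t noise produces occasional extremes.
y = X @ rng.normal(size=4) + rng.standard_t(df=3, size=500)

# Weight observations above the 90th percentile more heavily, so the
# model pays extra attention to the tail of the distribution.
threshold = np.quantile(y, 0.9)
weights = np.where(y > threshold, 5.0, 1.0)

model = GradientBoostingRegressor().fit(X, y, sample_weight=weights)
```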

4. Extra Dependencies and Dimensions

Time series often have extra dependencies besides time.

Spatio-temporal data is a common example. Each observation is correlated in two dimensions: with its own lags (temporal dependency), and with the lags of nearby locations (spatial dependency).

Spatio-temporal data is a particular instance of multivariate time series. These time series are represented by more than one variable.

The extra variables may contain invaluable information. So, modelling them may be critical for improving forecasting performance.
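As a minimal sketch, lagged copies of both the target and the extra variables can be built with pandas and fed to any supervised learner. The column names ("sales", "temperature") and lag counts below are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sales": rng.normal(size=100).cumsum(),
    "temperature": rng.normal(20, 5, size=100),
})

# Lagged copies of each variable become model inputs.
for lag in (1, 2, 3):
    df[f"sales_lag{lag}"] = df["sales"].shift(lag)
    df[f"temperature_lag{lag}"] = df["temperature"].shift(lag)

df = df.dropna()  # drop the first rows, whose lags are undefined
```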

5. Change Points

Things change over time. And so does the data distribution of the time series that represent those things.

Significant changes are known as change points. When they happen abruptly, these changes are known as structural breaks. Other times, change happens more slowly. These are referred to as gradual changes.

Change point detection is a well-studied topic in the literature. Check reference [1] for a comprehensive read.

Sometimes the change point is known.

A market crash occurs, or a war breaks out, and these events deeply affect how organizations operate. Yet, it’s not clear how one should cope with such change. Old observations are not as useful as before because the distribution has changed. But, there is little information about the new distribution.

Detecting and adapting to changes is important to keep models up-to-date. Monitoring the performance of these models is a good practice to detect changes.
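As a minimal sketch, offline change point detection can be run with the third-party ruptures library (assuming it is installed). The PELT algorithm used below is one of several it implements, and the synthetic series with an abrupt mean shift is made up for illustration:

```python
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(0)
# A series whose mean shifts abruptly at t=100: a structural break.
signal = np.concatenate([rng.normal(0, 1, 100), rng.normal(5, 1, 100)])

algo = rpt.Pelt(model="rbf").fit(signal)
breakpoints = algo.predict(pen=10)  # penalty controls sensitivity
print(breakpoints)  # last index marks the end of the series
```

The penalty value trades off sensitivity against false alarms; in practice it is tuned to the application.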

6. Low Signal-to-Noise Ratio

Loosely speaking, signal-to-noise ratio quantifies the predictability of a time series.

Signal is the relevant part of the data, the part you are trying to model and understand. But, this signal is often overshadowed by background noise: seemingly random, unpredictable fluctuations.

Sometimes, this background noise is simply lack of knowledge. We don’t know which factors affect the data. Or maybe these factors are difficult to quantify. So, the series’ movements appear to be random.

Financial data is a notorious example where low signal-to-noise ratio is prevalent.

7. Noise and Missing Values

Noise can stem from deficient data collection.

Real-world systems are plagued by noise and missing values. Figure 2 depicts this problem. It illustrates the bio-signals of a hospital patient. The raw variables are erratic. But, after applying a local regression (LOESS), their dynamics become clearer.

Noise and missing values may arise due to faulty equipment. A sensor fails, causing missing data. Or there’s interference, which leads to erroneous readings.

Noise may also occur due to mislabeling. This occurs when a human annotator assigns the wrong label to the data.

Adequate preprocessing steps may help enhance the signal of the series. Examples include the Kalman Filter or Exponential Smoothing.
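As a minimal sketch, pandas alone covers a simple version of both steps: interpolating missing values, then smoothing with an exponentially weighted mean. The synthetic series and the smoothing span are arbitrary choices:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# A noisy sinusoid standing in for an erratic real-world signal.
series = pd.Series(np.sin(np.arange(200) / 10) + rng.normal(0, 0.3, 200))
series.iloc[[20, 21, 80]] = np.nan  # simulate sensor dropouts

clean = series.interpolate()        # fill missing values
smooth = clean.ewm(span=10).mean()  # exponentially weighted smoothing
```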

8. Small Sample Size

Sometimes, time series contain a small number of observations. In such cases, algorithms may not have enough data to build adequate models.

This problem may arise due to a low sampling frequency. For example, the time series is only observed monthly or yearly. Or the things they represent occur infrequently (e.g. extreme weather events).

In the retail domain, you may encounter a cold start problem. This refers to cases where there’s little information on a newly launched product.

Lack of data may also arise due to change (see point 5 above). If a significant change occurs, historical data becomes obsolete. Then, you need new data that reflects the new distribution.

As I wrote in a previous post, Machine Learning models excel if there’s enough data. Otherwise, you should opt for simple solutions.

The problem of lack of data can be mitigated by using global forecasting models. These leverage many time series to build a model. You can learn more about these in a previous story.
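As a minimal sketch of the global approach, one can stack lag-based training pairs from several series and fit a single learner on the pooled data. The synthetic series, lag count, and choice of random forest below are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Ten short series: too little data to model each one on its own.
series_list = [rng.normal(size=60).cumsum() for _ in range(10)]

n_lags = 3
X, y = [], []
for s in series_list:  # pool training pairs from every series
    for i in range(len(s) - n_lags):
        X.append(s[i:i + n_lags])
        y.append(s[i + n_lags])

model = RandomForestRegressor().fit(np.array(X), np.array(y))
```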

Takeaways

In this post I described 8 challenges that you encounter in forecasting tasks. Here’s a summary:

  1. Non-stationarity: when the data properties change over time;
  2. Multi-step ahead forecasting: interest in long-term predictions;
  3. Extreme values: interest in rare cases;
  4. Extra dependencies: extra variables which may be crucial for accurate predictions;
  5. Change: detecting changes in the distribution;
  6. Low signal-to-noise ratio: when the time series has low predictability;
  7. Noise: random fluctuations in the data;
  8. Small samples: when there’s not enough data.

Thank you for reading, and see you in the next story!

Further Readings

[1] Aminikhanghahi, Samaneh, and Diane J. Cook. "A survey of methods for time series change point detection." Knowledge and information systems 51.2 (2017): 339–367.

[2] Karpatne, Anuj, et al. "Machine learning for the geosciences: Challenges and opportunities." IEEE Transactions on Knowledge and Data Engineering 31.8 (2018): 1544–1554.

