
PCA for Multivariate Time Series: Forecasting Dynamic High-Dimensional Data

System Forecasting in Presence of Noise and Serial Correlation

Photo by Viva Luna Studios on Unsplash

Multi-step ahead forecasting of multivariate time series is known to be a complex predictive task. We have to deal with the high dimensionality of both inputs and outputs; we have to handle cross-sectional and temporal dependencies adequately; last but not least, we have to ensure an acceptable level of long-term accuracy.

Nowadays, analytic applications that handle data that is huge in both its temporal depth and its number of series are very common. Accordingly, all the solutions built on top of these systems must be able to manipulate large datasets. In the Internet of Things (IoT) era, it’s usual to deal with large sets of time series which, in most cases, show strong correlation patterns. These dynamics arise very often in fields like telecommunications, industrial manufacturing, finance, electric grids, and so on.

Suppose we are data scientists responsible for developing a predictive analytic application that provides multi-step forecasts of an IoT system made up of correlated and noisy sensors. Multivariate forecasting is widely discussed in the literature. From statistical techniques like VAR (Vector Auto-Regressive models) to more sophisticated and recent deep-learning-based methodologies, there are many available solutions for carrying out such a predictive task. However, the real world is more complicated and crueler than expected: managing large sets of high-frequency sensors in real time requires solutions that blend an adequate degree of accuracy with reasonably responsive latency.

In this post, we develop a predictive application for multivariate and multi-step sensor forecasting that can be used in near real time. We make this possible by combining dimensionality reduction with forecasting techniques suited to multivariate contexts. The proposed methodology is popular in the economic forecasting literature and is known as Dynamic Factor Modeling [1]. In other words, we stack our favorite forecasting algorithm on top of the results of a dimensionality reduction technique (like PCA) to predict future system dynamics.

EXPERIMENT SETUP

For the purposes of this post, we generate multiple synthetic time series. The series can be separated into two groups according to their sinusoidal dynamics. Everything would be perfect, except that the signals are hidden by noise.

Synthetic dynamics present in the data [image by the author]

We add a good degree of noise to our time series to replicate the chaotic behavior of real-world systems and make the forecasting task more difficult.
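As a reference, a noisy two-group setup like the one above can be simulated with a few lines of NumPy. The number of series, the two frequencies, the amplitudes, and the noise level below are illustrative assumptions, not the exact values used in the post:

```python
import numpy as np

rng = np.random.default_rng(42)

n_obs, n_series = 600, 20
t = np.arange(n_obs)

# Two hidden sinusoidal dynamics with different frequencies.
signal_a = np.sin(2 * np.pi * t / 50)
signal_b = np.sin(2 * np.pi * t / 120)

# Half of the series follow each dynamic; every series gets its own
# random amplitude plus additive Gaussian noise that hides the signal.
X = np.empty((n_obs, n_series))
for i in range(n_series):
    base = signal_a if i < n_series // 2 else signal_b
    X[:, i] = rng.uniform(1, 3) * base + rng.normal(0, 1, n_obs)
```

Each column of `X` is one noisy sensor; columns within the same group share the same underlying sinusoid, which is exactly the kind of cross-sectional correlation DFM exploits.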

Simulated series with noise to forecast [image by the author]

With multiple time series at our disposal, our goal is to forecast them several steps ahead. Since we have to carry out the multivariate forecasting task in near real time, we must find a trade-off between prediction accuracy and the duration of the inference process.

Let’s see how we can approach the problem.

MULTIVARIATE DYNAMIC FORECASTING

Dynamic Factor Modeling (DFM) is a technique for multivariate forecasting taken from the economic literature [1]. The basic idea behind DFM is that a small number of series can account for the time behavior of a much larger number of variables. If we can obtain accurate estimates of these factors, the entire forecasting task can be simplified by using the estimated dynamic factors instead of using all series.

Dynamic Factor Modeling estimation flow [image by the author]

The quality of the predictions obtained with DFM depends on two main aspects: the goodness of the factor estimation and the accuracy of the factor forecasting. There are various ways to estimate dynamic factors. The most common, also adopted in the machine learning ecosystem [2], consists in extracting the principal components of the data through an orthogonal rotation (PCA).
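To make the factor-estimation step concrete, here is a minimal sketch: we build a toy matrix driven by two hidden sinusoids (all sizes and scales are assumptions for illustration) and check that two standardized principal components recover most of its variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Toy stand-in for the noisy sensor matrix: 300 observations of
# 12 series, all generated from just 2 latent sinusoidal factors.
rng = np.random.default_rng(0)
t = np.arange(300)
factors = np.column_stack([np.sin(2 * np.pi * t / 40),
                           np.sin(2 * np.pi * t / 90)])
loadings = rng.normal(size=(2, 12))
X = factors @ loadings + rng.normal(scale=0.5, size=(300, 12))

# Standardize, then estimate the dynamic factors as the first
# two principal components.
pca = PCA(n_components=2)
Z = pca.fit_transform(StandardScaler().fit_transform(X))

# With only two true dynamics, two components explain most variance.
print(pca.explained_variance_ratio_.sum())
```

Forecasting the two columns of `Z` instead of the twelve columns of `X` is the whole point of the approach: the predictive problem shrinks by a factor of six while the shared dynamics are preserved.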

DFM is also a model-agnostic technique. In other words, any dimensionality reduction method and any forecasting strategy can be plugged in. For our experiment, we use standard PCA and naive direct forecasting. Below is a code snippet showing how to carry out DFM estimation and prediction.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from tspiral.forecasting import ForecastingChain  # tspiral package

# Step 1: standardize the series, then estimate the dynamic factors
# as the first two principal components.
scaler_pca = make_pipeline(StandardScaler(), PCA(n_components=2))
X_train_factors = scaler_pca.fit_transform(X_train)

# Step 2: direct multi-step forecasting of the factors, with one
# Ridge model per horizon step, fed with the last 24 factor lags.
forecaster = ForecastingChain(
    Ridge(),
    n_estimators=test_size,
    lags=range(1, 25),
    use_exog=False,
    n_jobs=-1,
).fit(None, X_train_factors)

y_pred_factors = forecaster.predict(np.arange(test_size))

# Step 3: map the factor forecasts back to the original series space
# by inverting the PCA first and the scaling afterwards.
y_pred = scaler_pca.steps[0][-1].inverse_transform(
    scaler_pca.steps[-1][-1].inverse_transform(y_pred_factors)
)

As a final step, we solve the same task by simply applying multivariate direct forecasting to all the series at our disposal. Regardless of the results achieved, this methodology is hardly sustainable, since it requires computing lagged features for every single series. This may mean handling an enormous set of lagged variables, which makes forecasting infeasible for most systems due to physical (memory) and time limits.
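A quick back-of-the-envelope computation shows why the naive approach does not scale: the lagged design matrix grows with the product of the number of series and the number of lags, while the factor approach keeps it proportional to the (small) number of factors. The figures below are illustrative assumptions, not measurements from the post:

```python
# Feature blow-up of direct forecasting on the raw series vs. on the
# factors: with L lags per series, the design matrix has
# n_series * L columns, regardless of the regressor used.
n_series, n_lags, n_factors = 500, 24, 2

raw_features = n_series * n_lags      # one column per (series, lag) pair
factor_features = n_factors * n_lags  # same lags, but only on the factors

print(raw_features, factor_features)  # 12000 vs. 48 columns
```

With hundreds of high-frequency sensors, the raw design matrix quickly exceeds what can be fit and scored within a near real-time budget, while the factor representation stays tiny.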

We adopt a temporal cross-validation strategy to validate the results of both methodologies and store performances.
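As a sketch of what such a temporal validation can look like (the exact splitting scheme used in the post is not shown), scikit-learn's TimeSeriesSplit produces expanding training windows whose validation block always lies strictly in the future:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(100)  # stand-in for a time axis of 100 observations

# Expanding-window temporal cross-validation: each fold trains on the
# past and validates on the immediately following block of samples.
for fold, (train_idx, test_idx) in enumerate(
        TimeSeriesSplit(n_splits=3).split(X)):
    assert train_idx.max() < test_idx.min()  # no look-ahead leakage
    print(fold, len(train_idx), len(test_idx))
```

Unlike shuffled k-fold, this scheme never lets the model see future observations during training, which is essential for an honest comparison of forecasting methodologies.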

Visual comparison of forecasts of both methodologies [image by the author]

DFM outperforms naive multivariate direct forecasting. It achieves better accuracy in less time (inference/estimation time depends on the number of system variables). Inspecting the forecasts produced, we can observe that DFM correctly discriminates and reproduces the double sinusoidal dynamics present in the original series.

Comparison of multivariate forecasting performances [image by the author]

SUMMARY

In this post, we presented a practical application of Dynamic Factor Modeling. It proves to be a valid approach to multivariate time series forecasting, particularly suited to high-dimensional data that may also show a high degree of noise. As always, a perfect forecasting technique that suits all situations doesn’t exist. As data scientists, we are responsible for experimenting with techniques we don’t yet know. Only through continuous self-learning can we choose the best possible solutions for our daily tasks.


CHECK MY GITHUB REPO

Keep in touch: Linkedin


REFERENCES

[1] M. Forni, M. Hallin, M. Lippi, and L. Reichlin, "The Generalized Dynamic Factor Model: One-Sided Estimation and Forecasting", Journal of the American Statistical Association, vol. 100, no. 471, pp. 830–840, 2005.

[2] G. Bontempi, Y. -A. Le Borgne and J. de Stefani, "A Dynamic Factor Machine Learning Method for Multi-variate and Multi-step-Ahead Forecasting", 2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Tokyo, Japan, 2017, pp. 222–231.

