
How To Correctly Perform Cross-Validation For Time Series

Avoid the common pitfalls in applying cross-validation to time series and forecasting models.

Photo by aceofnet on Unsplash

Background

Cross-validation is a staple process when building any statistical or machine learning model and is ubiquitous in Data Science. However, for the more niche area of time series analysis and forecasting, it is very easy to incorrectly carry out cross-validation.

In this post, I want to showcase the problem with applying regular cross-validation to time series models and common methods to alleviate the issues. We will also go through an example of using cross-validation for hyperparameter tuning for a time series model in Python.

What Is Cross-Validation?

Cross-validation is a method to determine the best-performing model and parameters by training and testing the model on different portions of the data. The most common and basic approach is the classic train-test split, where we split our data into a training set used to fit the model and a test set used to evaluate it.

This idea can be taken one step further by carrying out the train-test split numerous times, varying the data we train and test on each time. This process is cross-validation: we use every row of data for both training and evaluation, ensuring we choose the most robust model across all of the available data.

Below is a visualisation of cross-validation using sklearn's KFold class, with n_splits=5, on the US airline passenger volumes dataset:
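The full gist isn't embedded here, but a minimal sketch of the plotting helper and the k-fold split could look like the following, assuming the Kaggle file is named AirPassengers.csv with Month and Passengers columns:

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import KFold


def plot_cross_val(cv, data):
    """Plot which indices fall into the train and test sets for each CV split."""
    fig, ax = plt.subplots(figsize=(10, 5))
    for split, (train_idx, test_idx) in enumerate(cv.split(data)):
        ax.scatter(train_idx, [split] * len(train_idx), marker="_", linewidths=8,
                   color="royalblue", label="Train" if split == 0 else None)
        ax.scatter(test_idx, [split] * len(test_idx), marker="_", linewidths=8,
                   color="darkorange", label="Test" if split == 0 else None)
    ax.set_xlabel("Observation index")
    ax.set_ylabel("CV split")
    ax.legend()
    plt.show()


# Monthly US airline passenger volumes (file and column names are assumptions)
data = pd.read_csv("AirPassengers.csv", parse_dates=["Month"], index_col="Month")

# Classic k-fold cross-validation with 5 splits
kfold = KFold(n_splits=5)
plot_cross_val(kfold, data)
```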

Data from Kaggle with a CC0 licence.

Plot generated by author in Python.

As we can see, the data has been split 5 times where each split contains a new training and testing dataset to build and evaluate our model upon.

Note: A different approach would be to split the data into training and test sets, then further split the training set into smaller training and validation sets. You can then carry out cross-validation with the various training and validation sets and measure the final model performance on the test set. This is what happens in practice for most machine learning models, as sketched below.
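As a rough sketch of that workflow on a generic (non-temporal) dataset, using a random feature matrix purely for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold

# Generic (non-temporal) feature matrix purely for illustration
X = np.random.rand(100, 3)

# Hold out a final test set first
X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

# Cross-validate within the remaining training data to choose the model,
# then report final performance once on X_test
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, val_idx in kfold.split(X_train):
    X_tr, X_val = X_train[train_idx], X_train[val_idx]
    # ...fit the candidate model on X_tr and evaluate it on X_val...
```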

Time Series Cross-Validation

The above cross-validation is not an effective or valid strategy for forecasting models due to their temporal dependency. For time series, we always predict into the future. However, in the above approach we would be training on data that is further ahead in time than some of the data we are testing on. This is data leakage and should be avoided at all costs.

To overcome this quandary, we need to ensure the test set always has a higher index (the index is usually time for time series data) than the training set. This means our test set is always in the future compared to the data our model is fitted on.

A depiction of this new cross-validation approach for time series is shown below, using sklearn's TimeSeriesSplit class and the plot_cross_val function we wrote above:
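Reusing the data and the plot_cross_val helper sketched earlier, this only takes a couple of lines:

```python
from sklearn.model_selection import TimeSeriesSplit

# Expanding-window splits: each test fold sits strictly after its training fold in time
tscv = TimeSeriesSplit(n_splits=5)
plot_cross_val(tscv, data)
```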

Plot generated by author in Python.

The test sets are now always further forward in time than the training sets, therefore avoiding any data leakage when building our model.

Hyperparameter Tuning

Cross-validation is frequently used in conjunction with hyperparameter tuning to determine the optimal hyperparameter values for a model. Let’s quickly go over an example of this process, for a forecasting model, in Python.

First, we plot the data:
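A minimal plotting snippet, assuming the Passengers column name from the earlier sketch, could be:

```python
import matplotlib.pyplot as plt

# Reusing the dataframe loaded earlier; the column name is an assumption
data["Passengers"].plot(figsize=(10, 5))
plt.xlabel("Date")
plt.ylabel("Passenger volumes")
plt.show()
```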

Data from Kaggle with a CC0 licence.

Plot generated by author in Python.

The data has a clear trend and high seasonality. A suitable model for this time series would be the _Holt Winters exponential smoothing_ model that incorporates both trend and seasonality components. If you want to learn more about the Holt Winters model, check out my previous post on it here:

Time Series Forecasting with Holt Winters’

In the following code snippet, we tune the seasonal smoothing factor, smoothing_seasonal, using grid search and cross-validation, and plot the results:
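The original gist isn't reproduced here, but a rough sketch of the grid-search loop, assuming a MAPE scoring metric, an additive trend with multiplicative yearly seasonality, and four expanding-window splits, could look like this:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import TimeSeriesSplit
from statsmodels.tsa.holtwinters import ExponentialSmoothing

tscv = TimeSeriesSplit(n_splits=4)
smoothing_values = np.round(np.arange(0.1, 1.0, 0.1), 2)
cv_scores = []

for value in smoothing_values:
    fold_errors = []
    for train_idx, test_idx in tscv.split(data):
        train, test = data.iloc[train_idx], data.iloc[test_idx]
        # Holt-Winters with additive trend and multiplicative yearly seasonality,
        # fixing only the seasonal smoothing factor for this grid point
        model = ExponentialSmoothing(train["Passengers"], trend="add",
                                     seasonal="mul", seasonal_periods=12
                                     ).fit(smoothing_seasonal=value)
        forecast = model.forecast(len(test))
        fold_errors.append(mean_absolute_percentage_error(test["Passengers"], forecast))
    # Average the error across the cross-validation folds
    cv_scores.append(np.mean(fold_errors))

plt.plot(smoothing_values, cv_scores)
plt.xlabel("smoothing_seasonal")
plt.ylabel("Mean CV MAPE")
plt.show()
```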

Plot generated by author in Python.

As we can see, it appears the optimal value of the smoothing_seasonal hyperparameter is 0.8.

In this case we carried out grid-search cross-validation manually, but many packages can do this for you.

If you want to learn more about the hyperparameter tuning domain, check out my previous article on using _Bayesian Optimization_ through the Hyperopt package:

Hyperopt Tutorial: Optimise Your Hyperparameter Tuning

Summary and Further Thoughts

In this post, we have shown why you can’t simply use regular cross-validation on your time series models: the temporal dependency in the data causes data leakage. Therefore, when carrying out cross-validation for forecasting models, you must ensure that your test set is always further in time than the training set. This is easily done, and many packages provide functions that help with this approach.

The code in the gists can be hard to follow due to the flow of the article, so I recommend checking out the full code on my GitHub here:

Medium-Articles/cross_validation.py at main · egorhowell/Medium-Articles

Another Thing!

I have a free newsletter, Dishing the Data, where I share weekly tips for becoming a better Data Scientist. There is no "fluff" or "clickbait," just pure actionable insights from a practicing Data Scientist.

Dishing The Data | Egor Howell | Substack
