
Note: This article assumes that you are already familiar with the SARIMAX model. If you’re not, please check it out first.
Introduction
When I was learning time series analysis, I often heard people say: "If the AR model can’t handle it, try ARIMA".
Then I would hear: "If ARIMA doesn’t work, check for a seasonal component".
And it would go on: "If SARIMA doesn’t work, try adding exogenous variables and make it SARIMAX or ARDL".
When that wouldn’t work, I heard "What about cointegration?" And let’s be honest – in a lot of cases, this is enough.
Nevertheless, things get a little tense when you say that none of the above worked. This is the moment when even the most peaceful, kind-hearted econometrician starts getting nervous and says: "If nothing really works, go for a naïve forecast or just assume the mean", or even utters the saddest sentence you will ever hear: "Okay, then it’s unforecastable".
But is it?
Linearity and nonlinearity
First, it’s worth noting that all of the models I have mentioned are based on linear regression, and their identification procedures rely on linear relationships between lagged variables or on the ECM term. This approach is perfectly valid; however, in some cases it turns out to be incomplete. Let me give you a quick example. I have generated a time series:

Let’s say I’m doing a standard EDA. Sooner or later I would have to check the autocorrelation of the time series:
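If you want to reproduce that step, it’s a one-liner in base R (here y is just a placeholder name for the generated series, since the original post doesn’t name it):

# ACF of the generated series (y is a stand-in name)
acf(y, lag.max = 50)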

What would be your first conclusion after seeing this chart? ARIMA’s out, maybe some seasonal model without the autoregressive component?

Well, before going any further I must admit that I tricked you a little bit – the time series I generated is entirely deterministic and depends only on the first lag! In fact, the whole series can be generated with just a couple of lines of code.
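The original snippet isn’t reproduced here, but a minimal sketch of such a series – a purely deterministic recursion in which each value is a quadratic function of the previous one (the logistic map, which is my illustrative choice, not necessarily the exact map behind the plots above) – could look like this:

# illustrative deterministic series: quadratic dependence on the first lag only
n <- 500
y <- numeric(n)
y[1] <- 0.4                               # arbitrary starting value in (0, 1)
for (t in 2:n) {
  y[t] <- 4 * y[t - 1] * (1 - y[t - 1])   # logistic map: no noise, first lag only
}
plot.ts(y)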
As you have probably noticed by now, the ACF test failed miserably at detecting the true relationship in the data. The only thing this test got more or less right is the seasonality. The reason is that the ACF measures only linear dependence, so a relationship that is quadratic in the first lag can leave the autocorrelations close to zero. To understand this in more depth, take a look at this video:
But now, let’s take a look at its lag plot:
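The lag plot and the regression line described below can be reproduced with base R alone; treat this as a sketch, since the original plotting code isn’t shown:

# scatter of y(t) against y(t-1), plus a straight-line fit
y_lag <- y[-length(y)]
y_now <- y[-1]
plot(y_lag, y_now, xlab = "y(t-1)", ylab = "y(t)")
abline(lm(y_now ~ y_lag), col = "red")    # the (poorly fitting) linear regression line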

The red line is a regression line fitted to this dataset. As you can see, the relationship between moment t and moment t-1 is quadratic. How could a linear model possibly forecast that? I’m showing you all of this because I want you to realise that even in the friendliest nonlinear environment, the standard Box-Jenkins approach might not work.
How do we check for nonlinear dependence in a time series, then? One of the approaches (certainly NOT the only one, and one that will NOT work in every case) is the Average Mutual Information (AMI):
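One readily available implementation is mutual() from the tseriesChaos package, which estimates the average mutual information between a series and its lags (the package choice is mine; any AMI implementation will do):

# install.packages("tseriesChaos")        # if needed
library(tseriesChaos)
mutual(y, lag.max = 20)                   # plots the AMI up to lag 20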

This surely seems closer to the truth than what we learnt from the ACF test. Note that we can find an indication of seasonality here as well. Deterministic time series are hardly ever found in real-life situations (except in physics, I guess), so let’s move on to a non-deterministic case. We are going to analyse another time series (an artificial dataset, generated with tsDyn). Its ACF and AMI plots look like this:

And its lag plot looks like this:

Can you say at this point if ARIMA will work?

According to the ACF test there is no autocorrelation, so auto-arima fitted only the intercept. Normally that means our forecast is not going to be very fancy. What should we do then? There are, in fact, nonlinear models that can deal with this kind of time series, but I don’t want to make this article too technical, so I’m not going to explain their properties right now (maybe in another article?). Regardless, let me compare the results of a nonlinear SETAR and a linear ARIMA:
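Both fits take only a few lines; below is a minimal sketch using forecast::auto.arima and tsDyn::setar, with x standing in for the simulated series (the embedding order m = 1 is my assumption based on the first-lag dependence in the AMI plot, not necessarily the specification behind the charts below):

library(forecast)
library(tsDyn)

arima_fit <- auto.arima(x)                # order chosen automatically (here: intercept only)
setar_fit <- setar(x, m = 1)              # two-regime SETAR with one lag per regime

forecast(arima_fit, h = 20)               # linear forecast
predict(setar_fit, n.ahead = 20)          # nonlinear forecast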

The difference between the two is obvious – linear ARIMA wasn’t able to model this time series as well as SETAR did. Since the series itself is not autocorrelated, there is no point in checking the autocorrelation of the residuals; what we can do instead is compare the AMI of the initial time series, the ARIMA residuals and the SETAR residuals:
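Sticking with the same (assumed) tools, the three AMI plots can be produced like this, with x, arima_fit and setar_fit as in the previous sketch:

library(tseriesChaos)

mutual(x, lag.max = 20)                              # AMI of the original series
mutual(na.omit(residuals(arima_fit)), lag.max = 20)  # AMI of the ARIMA residuals
mutual(na.omit(residuals(setar_fit)), lag.max = 20)  # AMI of the SETAR residuals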

The SETAR residuals no longer have a spike at the first lag, unlike the initial series and the ARIMA residuals, so SETAR is the better model here. At this point, you might think that there is a clear dividing line between nonlinear and linear time series. Wrong! While there are cases – like the one above – that cannot be dealt with using the Box-Jenkins approach, most nonlinear time series look linear at first. Moreover, linear models often outperform nonlinear ones! That happens because forecasting boils down to minimising some error function, and depending on which one you choose as your criterion, you will reach different conclusions (I’ll give a tiny numeric illustration of this in a moment). Let me show you another example of a nonlinear time series, one that you are more likely to encounter than the two above:


If you saw it during your analysis, would you say it’s linear or nonlinear? The ACF test shows that there is significant autocorrelation here. While I can assure you that this time series was in fact generated by a nonlinear process, I cannot give you my word that ARIMA will not forecast it better – it depends on which need has to be satisfied. Moreover, when you’re dealing with nonlinear models, you’re more prone to incorrect identification and overfitting.
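To make the earlier point about error criteria concrete, here is a tiny made-up numeric illustration: two forecasts of the same values can be ranked differently depending on whether you judge them by the mean absolute error or by the root mean squared error.

actual <- c(0, 0, 0, 0)
f1 <- c(1, 1, 1, 1)                       # consistently off by 1
f2 <- c(0, 0, 0, 2.2)                     # mostly perfect, one large miss

mae  <- function(e) mean(abs(e))
rmse <- function(e) sqrt(mean(e^2))

mae(actual - f1);  mae(actual - f2)       # 1.00 vs 0.55 -> f2 wins on MAE
rmse(actual - f1); rmse(actual - f2)      # 1.00 vs 1.10 -> f1 wins on RMSE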
Summary
There are many different types of nonlinear relationships in time series, and it’s simply not possible to show all of them in a single article, but I wanted to give you a brief and intuitive introduction to nonlinear time series – and to why the most popular approaches can fail miserably in real life. While in most cases standard ARIMA will do just fine, the world of nonlinear dependencies is so much more complex.
Furthermore, there is no single procedure that would take care of them all, since the number of combinations of nonlinearities that can occur at the same time is nearly infinite. Modern Bayesian, semiparametric and nonparametric techniques do offer some help here – and even though they will certainly not do all of the work for us, they can automate some of it and cover a wider range of possible solutions with a data-driven approach.
Don’t forget to follow me on Medium!
Also, feel free to contact me on LinkedIn.