"In probability theory and statistics, a unit root is a feature of some stochastic processes (such as random walks) that can cause problems in statistical inference involving time series models. A linear stochastic process has a unit root if 1 is a root of the process’s characteristic equation. Such a process is non-stationary but does not always have a trend."
Yeah, doesn’t make much sense to me either …
However, the above statement is overcomplicating it, and the unit root in reality is not an overly hard concept to get your head around.
What is Stationarity?
Overview
To understand unit roots, we must first gain a clear understanding of stationarity. I have covered this topic in a previous blog and YouTube video, but we will quickly recap the key points.
A stationary time series is one whose statistical properties, such as its mean and variance, don’t change over time. A time series can be made stationary by differencing (to stabilise the mean) and by applying a logarithm transform (to stabilise the variance).
See blog or video above for more information on these transformations:
Plot generated by author in Python.
Notice how the mean and variance are roughly constant and don’t change throughout the plot.
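As a rough illustration of those two transforms, here is a minimal sketch using only the Python standard library. The function name `log_diff` and the toy series are my own assumptions for this example, not code from the earlier post.

```python
import math

def log_diff(series):
    """Take the natural log of each value, then difference consecutive values."""
    logged = [math.log(v) for v in series]               # log: stabilises the variance
    return [b - a for a, b in zip(logged, logged[1:])]   # difference: stabilises the mean

# Toy series growing by 5% per step: both its level and its step size increase over time.
series = [100 * 1.05 ** t for t in range(10)]
transformed = log_diff(series)
print(transformed[:3])  # every entry is approximately log(1.05)
```

The raw series keeps climbing, while the transformed one is flat, which is the behaviour we want from a stationary series.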
Why do we need it?
The requirement for stationarity comes from how we build forecasting models. Most models, such as ARIMA, require the data to come from the same distribution so that they can fit parameters to that underlying distribution.
If the time series is stationary, it has consistent statistical properties. This means the joint distribution between data points doesn’t change, so they can be seen as belonging to the same distribution, which allows us to fit forecasting models.
Weakly vs Strongly Stationary
The final concept I want to discuss is the difference between weakly and strongly stationary. This is not super important, but very useful to build context around the idea of stationarity.
Weakly: A time series is weakly stationary if its mean and variance are constant over time and the covariance between two data points depends only on the time lag between them.
Strongly: A time series is strongly stationary if the joint distribution of the data points is invariant over time. In other words, they belong to the same distribution.
At first, this concept may seem confusing but let me give you an example. If our time series has a constant mean and variance and is generated from a normal distribution, then it is both weakly and strongly stationary. This is because the normal distribution is a function of two parameters: the mean and variance.
However, there are instances where the mean and variance are constant but the data points all belong to different distributions. The first data point may come from a Poisson distribution, the second from an Exponential distribution, etc. In this case, the series is weakly stationary but not strongly stationary.
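Here is a small sketch of that idea. To stay within the Python standard library I have swapped in a different pair of distributions (standard normal and a uniform with the same mean and variance), so treat the specific choice, and the helper names `draw` and `kurtosis`, as my own assumptions.

```python
import math
import random
import statistics

rng = random.Random(0)
HALF_WIDTH = math.sqrt(3.0)  # Uniform(-sqrt(3), sqrt(3)) has mean 0 and variance 1

def draw(t):
    """Even time steps are standard normal; odd ones are uniform with matching mean and variance."""
    if t % 2 == 0:
        return rng.gauss(0.0, 1.0)
    return rng.uniform(-HALF_WIDTH, HALF_WIDTH)

def kurtosis(xs):
    """Fourth central moment divided by squared variance (3 for a normal, 1.8 for a uniform)."""
    m = statistics.fmean(xs)
    var = statistics.fmean([(x - m) ** 2 for x in xs])
    return statistics.fmean([(x - m) ** 4 for x in xs]) / var ** 2

series = [draw(t) for t in range(100_000)]
even, odd = series[0::2], series[1::2]

# Mean and variance agree across the two halves, but the shape (kurtosis) does not.
for name, xs in (("even steps (normal)", even), ("odd steps (uniform)", odd)):
    print(name, round(statistics.fmean(xs), 2), round(statistics.pvariance(xs), 2), round(kurtosis(xs), 2))
```

The first two moments are the same at every time step, so the series passes the weak test, yet the distribution itself changes from step to step, so it is not strongly stationary.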
See here if you want to learn more about weak vs strong stationarity
Unit Root Explained
Stationarity – Unit Root Link
How does stationarity link to unit-roots? Well, if our time series contains a unit root then it is not stationary.
Most statistical tests for stationarity are unit root tests: they look for a unit root to decide, at a given significance level, whether the time series is stationary. Tests include the augmented Dickey-Fuller test and the Phillips–Perron test.
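To give a feel for what such a test does, below is a hand-rolled sketch of the simple (non-augmented) Dickey-Fuller regression, written with the standard library only. It is not the full augmented test; in practice you would reach for a library routine such as `adfuller` in statsmodels. The function names, the seed, and the critical value of -2.86 (an approximate asymptotic 5% value for a regression with a constant) are assumptions of this sketch.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-statistic of gamma in the regression  dy_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                   # lagged levels y_{t-1}
    d = [b - a for a, b in zip(y, y[1:])]        # first differences dy_t
    n = len(d)
    x_bar, d_bar = sum(x) / n, sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    gamma = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d)) / sxx
    alpha = d_bar - gamma * x_bar
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    se_gamma = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / se_gamma

def simulate_ar1(phi, n, seed):
    """Simulate y_t = phi * y_{t-1} + eps_t with standard normal noise and y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

CRITICAL_5PCT = -2.86  # approximate asymptotic 5% value, constant-only case (assumption)

for label, phi in (("random walk (phi = 1)", 1.0), ("stationary AR(1) (phi = 0.5)", 0.5)):
    stat = dickey_fuller_stat(simulate_ar1(phi, 500, seed=7))
    verdict = "reject unit root" if stat < CRITICAL_5PCT else "cannot reject unit root"
    print(f"{label}: statistic = {stat:.2f} -> {verdict}")
```

The null hypothesis is that a unit root is present (gamma = 0), so a strongly negative statistic is evidence that the series is stationary.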
A Simple Example
The best way to demonstrate the unit root is through the simple AR(1) model, an autoregressive model with 1 lag:
$$y_t = \phi\, y_{t-1} + \varepsilon_t$$
AR(1). Equation generated by author in LaTeX.
Where:
y: is the time series at different time steps t.
φ: is the coefficient for the first lag term.
ε: is some stochastic noise of independent and identically distributed random variables with mean 0 and constant variance σ².
The above equation can be re-written by expressing the AR(1) model as a MA(∞) (Moving Average) process through recursion:
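$$y_t = \phi\,(\phi\, y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t = \phi^2 y_{t-2} + \phi\,\varepsilon_{t-1} + \varepsilon_t$$
AR(1) recursion substitution. Equation generated by author in LaTeX.
Carrying on this substitution all the way back to the starting value y_0, we get:
$$y_t = \phi^t y_0 + \phi^{t-1}\varepsilon_1 + \phi^{t-2}\varepsilon_2 + \dots + \phi\,\varepsilon_{t-1} + \varepsilon_t$$
AR(1) as MA(∞). Equation generated by author in LaTeX.
The above equation can then be simplified to:
$$y_t = \phi^t y_0 + \sum_{k=0}^{t-1} \phi^k \varepsilon_{t-k}$$
AR(1) as MA(∞) simplified. Equation generated by author in LaTeX.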
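If you want to convince yourself that the substitution works, the following sketch compares the recursive AR(1) definition with its unrolled form, y_t = φ^t·y_0 + Σ φ^k·ε_{t−k}, on simulated noise. The parameter values (φ = 0.8, y_0 = 2, 50 steps) and the seed are arbitrary choices of mine.

```python
import random

random.seed(1)
phi, y0, n = 0.8, 2.0, 50
eps = [0.0] + [random.gauss(0.0, 1.0) for _ in range(n)]  # eps[1..n]; eps[0] is unused padding

# Recursive form: y_t = phi * y_{t-1} + eps_t
y = [y0]
for t in range(1, n + 1):
    y.append(phi * y[-1] + eps[t])

# Unrolled form: y_t = phi^t * y_0 + sum_{k=0}^{t-1} phi^k * eps_{t-k}
def unrolled(t):
    return phi ** t * y0 + sum(phi ** k * eps[t - k] for k in range(t))

max_gap = max(abs(y[t] - unrolled(t)) for t in range(n + 1))
print(max_gap)  # tiny: the two forms differ only by floating point rounding
```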
Now, the AR(1) equation has a unit root when φ = 1. (The boundary case φ = −1 is also non-stationary, which is why the condition is often written as |φ| = 1.)
For models with more lags, the unit root becomes harder to define and relies on the characteristic equation (remember we saw this quoted in the Wikipedia definition!). Don’t worry too much about this, as it’s outside the scope of this article and is not required to grasp the intuition behind the unit root.
So, the equation in the presence of a unit root becomes:
$$y_t = y_0 + \sum_{k=0}^{t-1} \varepsilon_{t-k}$$
AR(1) with unit root. Equation generated by author in LaTeX.
Some of you may recognise this as the famous random walk. The above equation can be rewritten in recursive form as:
$$y_t = y_{t-1} + \varepsilon_t$$
AR(1) with unit root re-written. Equation generated by author in LaTeX.
Now, the mean (expected value) of the above equation is simply:
$$\mathbb{E}[y_t] = y_0$$
Expected value of time series. Equation generated by author in LaTeX.
This is because ε has a mean of 0 (and a variance of σ²), so every noise term drops out in expectation. Therefore, the expected value of y_t is just y_0. This is good because the mean is constant and we meet that specific requirement of stationarity.
What about the variance?
Well, in the unit root case, y_t is y_0 plus the sum of t noise terms. The value of y_0 is fixed and so has no variance, but each ε has a variance of σ². Because the noise terms are independent, their variances add up, so the variance is:
$$\mathrm{Var}(y_t) = t\,\sigma^2$$
We can see that the variance depends on t: Var(y_1) is σ², Var(y_2) is 2σ², etc.
The variance is increasing through time, so the time series by definition is not stationary!
Therefore, we conclude that the presence of a unit root in our time series renders it non-stationary.
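A quick simulation makes this growth visible. The sketch below generates many independent random walks and measures the variance across them at a few time steps. The number of paths, the number of steps, the seed, and σ = 1 are all arbitrary assumptions of mine.

```python
import random
import statistics

random.seed(42)
sigma, n_paths, n_steps = 1.0, 2000, 100

# Simulate many independent random walks, all starting at y_0 = 0.
paths = []
for _ in range(n_paths):
    y, path = 0.0, []
    for _ in range(n_steps):
        y += random.gauss(0.0, sigma)
        path.append(y)          # path[t - 1] holds y_t
    paths.append(path)

def variance_at(t):
    """Variance of y_t measured across all simulated paths."""
    return statistics.pvariance([p[t - 1] for p in paths])

for t in (1, 10, 50, 100):
    print(t, round(variance_at(t), 2))  # should sit near t * sigma**2
```

The measured variance should track t·σ² closely, so it keeps climbing the longer the walk runs.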
An Intuitive Explanation
Consider the AR(1) equation again:
$$y_t = \phi\, y_{t-1} + \varepsilon_t$$
AR(1). Equation generated by author in LaTeX.
What φ is telling us is how much the value of today depends on the previous value, plus some random noise with a mean of 0. If |φ| < 1, then the time series will naturally revert to the origin. Say our starting point is y_0 = 0: repeatedly multiplying by a number smaller than 1 in absolute value shrinks any deviation, so it eventually converges back to 0. As it will always revert, it’s predictable in the long run and its variance settles to a constant rather than growing over time.
In the unit root case, φ = 1, the time series won’t revert to the origin. If we have a good run of positive values, the time series will simply stay there, as the forecast for the next value is equal to the last value.
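To put numbers on that contrast, here is a short worked derivation of the variance in both cases. It assumes a fixed starting value y_0 and independent noise terms with variance σ², which is what lets the variances simply add.

$$\mathrm{Var}(y_t) = \sigma^2 \sum_{k=0}^{t-1} \phi^{2k} = \sigma^2\,\frac{1-\phi^{2t}}{1-\phi^2} \;\longrightarrow\; \frac{\sigma^2}{1-\phi^2} \quad \text{as } t \to \infty, \qquad |\phi| < 1$$

$$\mathrm{Var}(y_t) = \sigma^2 \sum_{k=0}^{t-1} 1 = t\,\sigma^2, \qquad \phi = 1$$

So with |φ| < 1 the variance levels off at a finite value, whereas with φ = 1 it keeps growing with t.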
For example, let’s say we have two different AR(1) models:
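$$\text{Equation 1:}\quad y_t = 0.8\, y_{t-1} + \varepsilon_t$$
$$\text{Equation 2:}\quad y_t = y_{t-1} + \varepsilon_t$$
Equation generated by author in LaTeX.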
Equation 1 hasn’t got a unit root but equation 2 does. Let’s say both models start from y_0 = 0. Now, after some time we reach y_5 = 5 for both models. But what will happen next?
As the error ε has a mean of 0, equation 1 is expected to slowly trend back to the origin, so on average y_6 = 4, y_7 = 3.2, etc. However, equation 2 is expected to simply stay at that level: y_6 = 5, y_7 = 5, etc.
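The arithmetic above can be sketched in a few lines. The helper `expected_path` is a hypothetical name of mine, and the coefficient 0.8 for equation 1 is inferred from the values y_6 = 4 and y_7 = 3.2.

```python
def expected_path(phi, start, steps):
    """Expected future values of an AR(1) process given its current value (noise has mean 0)."""
    path, y = [], start
    for _ in range(steps):
        y = phi * y      # E[y_{t+1} | y_t] = phi * y_t
        path.append(y)
    return path

print(expected_path(0.8, 5.0, 3))  # shrinks towards 0 at each step
print(expected_path(1.0, 5.0, 3))  # stays at 5 forever
```

Note that these are expected values only. Any single realisation will wander around them because of the noise term.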
Therefore, the presence of the unit root makes the time series unpredictable in the long run and affected by ‘shocks’ to the system. This is why we deem it to be non-stationary.
Summary & Further Thoughts
The unit root is a fundamental concept in time series analysis and is arguably the backbone of stationarity, which is a core requirement when building many forecasting models. A time series has a unit root if 1 is a solution of its characteristic equation. This leads to a variance that changes through time, which breaks one of the requirements for stationarity. Therefore, most statistical tests for stationarity look for a unit root to decide if the time series is stationary. It is important to note that we don’t need to remember all the maths around the unit root, but rather understand the key concepts and why it leads to a non-stationary time series!