Time Series from Scratch

Today you’ll learn one of the most important concepts in time series – stationarity. You’ll learn how to tell if a dataset is stationary, how to test for stationarity and how to automate the testing process.
The article is structured as follows:
- Introduction to stationarity
- ADF test – How to test for stationarity
- Automating stationarity tests
- Conclusion
Introduction to stationarity
The previous article in the series briefly touched on the idea of stationarity. Today you’ll learn all there is to the topic. Let’s start by examining one of the formal definitions:
A stationary process is a stochastic process whose unconditional joint probability distribution does not change when shifted in time. Consequently, parameters such as mean and variance also do not change over time. (Source: Wikipedia)
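Strictly speaking, the first sentence of that definition describes strict stationarity: the joint distribution of any set of observations stays the same when you shift it in time. It is very restrictive, so you won’t see it often in practice.
The second sentence points to weak-form stationarity, which only asks for a constant mean, a constant variance, and a covariance that depends on the lag alone. That’s the form you should care about in time series analysis.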
A time series has to satisfy the following conditions to be considered stationary:
- Constant mean – average value doesn’t change over time.
- Constant variance – variance doesn’t change over time.
- Constant covariance – covariance between two observations depends only on the distance (lag) between them, not on when they occur.
You can test for stationarity with statistical tests, but sometimes plotting a time series can give you a rough estimate. Here’s an image showing stationary vs. non-stationary series:

A stationary series is centered around some value, doesn’t have too many spikes and unexpected variations, and doesn’t show drastic behavior changes from one part to the other.
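A rough numerical counterpart to eyeballing a plot is to split a series into halves and compare their means and variances. The sketch below does that on two synthetic series, generated here purely for illustration and not taken from the article’s dataset. The helper name `half_stats` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Synthetic examples: white noise (stationary) and noise around an upward trend
noise = rng.normal(loc=0.0, scale=1.0, size=n)
trending = 0.05 * np.arange(n) + rng.normal(loc=0.0, scale=1.0, size=n)


def half_stats(values: np.ndarray) -> dict:
    """Return mean and variance of the first and second half of a series."""
    first, second = np.array_split(values, 2)
    return {
        "mean_first": float(first.mean()),
        "mean_second": float(second.mean()),
        "var_first": float(first.var()),
        "var_second": float(second.var()),
    }


noise_stats = half_stats(noise)
trend_stats = half_stats(trending)

print("white noise:", noise_stats)
print("trending:   ", trend_stats)
```

For the white noise, the two halves should roughly agree, while the trending series shows a clear jump in the mean. Treat this as a heuristic only: it says nothing about covariance, which is one reason a formal test is still useful.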
You should care about stationarity for two reasons:
- Stationary processes are easier to analyze.
- Most forecasting algorithms assume a series is stationary.
You now know the basic theory behind weak-form stationarity. Let’s cover the tests next.
ADF test – How to test for stationarity
A while back, David Dickey and Wayne Fuller developed a test for stationarity – the Dickey-Fuller test. It was later extended into the Augmented Dickey-Fuller test, or ADF test for short.
It boils down to a simple hypothesis testing:
- Null hypothesis (H0) – Time series is not stationary.
- Alternative hypothesis (H1) – Time series is stationary.
In Python, the ADF test returns the following:
- Test statistic
- P-value
- Number of lags used
- Number of observations used
- 1%, 5%, and 10% critical values
- The maximized information criterion (don’t worry about it)
If the returned P-value is higher than 0.05, you can’t reject the null hypothesis, so you treat the time series as non-stationary. 0.05 is the standard threshold, but you’re free to change it.
Let’s implement the ADF test next. We’ll start with the library imports, dataset loading, and visualization for the airline passengers dataset:
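The original code listing isn’t reproduced here. A minimal loading sketch could look like the following. The column names `Month` and `Passengers` and the helper `load_series` are assumptions, so adjust them to your copy of the dataset. The three inline rows are placeholders that stand in for a real CSV file.

```python
import io

import pandas as pd


def load_series(source, date_col: str = "Month", value_col: str = "Passengers") -> pd.Series:
    """Read a CSV and return the value column indexed by parsed dates."""
    df = pd.read_csv(source, parse_dates=[date_col], index_col=date_col)
    return df[value_col]


# Placeholder rows standing in for a real file; pass a file path instead
sample_csv = io.StringIO("Month,Passengers\n1949-01,112\n1949-02,118\n1949-03,132\n")
series = load_series(sample_csv)
print(series)

# With matplotlib installed, series.plot() draws the line chart
```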
Here’s what the dataset looks like:

It doesn’t look stationary at all, but let’s verify that with a test:
Here are the results:

The P-value is just over 0.99, providing strong evidence that the dataset isn’t stationary. You’ve learned the concept of differencing in the previous articles. Now you’ll use it to calculate the N-th order difference. Here’s how the procedure looks for the first and second order:
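One common way to compute these differences is the pandas `diff()` method, sketched below on a short toy series (the squares of 1 to 5, chosen so the result is easy to check by hand, not the article’s dataset). Each pass leaves one extra NaN at the start, which `dropna()` removes.

```python
import pandas as pd

# Toy series: squares of 1..5
series = pd.Series([1, 4, 9, 16, 25])

# First-order difference: y[t] - y[t-1]
first_diff = series.diff().dropna()

# Second-order difference: the difference of the first-order difference
second_diff = series.diff().diff().dropna()

print(first_diff.tolist())   # [3.0, 5.0, 7.0, 9.0]
print(second_diff.tolist())  # [2.0, 2.0, 2.0]
```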
And here’s the visualization:

The differenced series looks more promising than the original data, but let’s use the ADF test to verify that claim:
Here’s the output:

The first-order difference didn’t make the time series stationary, at least not at the usual significance level. Second-order differencing did the trick.
You can see how manual testing of different differencing orders can be tedious. That’s why you’ll write an automation function next.
Automating stationarity tests
The automation function will accept the following parameters:
- data: pd.Series – time series values, without the datetime information
- alpha: float = 0.05 – significance level, set to 0.05 by default
- max_diff_order: int = 10 – the maximum number of times the time series can be differenced
A Python dictionary is returned, containing differencing_order and time_series keys. The first one is self-explanatory, and the second one contains the differenced time series.
The function will first check if the series is already stationary. If that’s the case, it’s returned as-is. If not, the ADF test is performed for every differencing order up to max_diff_order. The function keeps track of P-values and returns the series with the lowest differencing order whose P-value is below the significance level alpha.
Here’s the entire function:
Let’s now use it to make the airline passengers dataset stationary:
Here’s the visualization:

Just like before, second-order differencing is required to make the dataset stationary. But what if you decide on a different significance level? Well, take a look for yourself:
Here’s the chart:

You’ll have to difference the dataset eight times for the significance level of 0.01. It would be a nightmare to revert, so you should probably stick with a higher significance level.
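To see why many differencing passes are awkward to revert, note that each pass can only be undone if you kept the first value it dropped. The sketch below shows the idea with NumPy on a made-up five-point series. `undo_diff` is a hypothetical helper name.

```python
import numpy as np


def undo_diff(diffed: np.ndarray, first_value: float) -> np.ndarray:
    """Invert one differencing pass, given the value that pass dropped."""
    return np.concatenate([[first_value], first_value + np.cumsum(diffed)])


original = np.array([3.0, 8.0, 6.0, 11.0, 15.0])  # made-up values

first = np.diff(original)  # first-order difference
second = np.diff(first)    # second-order difference

# Undo in reverse order, supplying one stored starting value per pass
restored_first = undo_diff(second, first[0])
restored = undo_diff(restored_first, original[0])

print(restored)
```

Eight passes would need eight stored starting values, applied in the right order.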
Conclusion
And there you have it – everything you should know about stationarity. The whole concept will get clearer in a couple of articles when you start with modeling and forecasting. For now, remember that a stationary process is easier to analyze and is required by most forecasting models.
There are still a couple of things left to cover before forecasting. These include train/test splits, metrics, and evaluations. All of these will be covered in the next article, so stay tuned.