Improve your time series analysis with stochastic and deterministic component decomposition

How to decompose your time series into components you can forecast with dynamical systems and stochastic process methods

Tiago Toledo Jr.
Towards Data Science


In my last post, I talked about how one can use Takens’ Embedding Theorem to forecast deterministic time-series data with standard machine learning algorithms.

What I didn’t cover in that post was what we can do if the series we are dealing with is not fully deterministic. As we saw there, when the series presents some noise, the quality of the prediction drops sharply.

In this post, we will analyze and implement the results from the paper ‘Applying Empirical Mode Decomposition and mutual information to separate stochastic and deterministic influences embedded in signals’ [1], which proposes a framework for decomposing a given time series into stochastic and deterministic components.

Being able to generate that decomposition will improve the predictions we can achieve with Takens’ Embedding Theorem and will also allow us to model the stochastic component separately.

The main idea here is to analyze the phase spectrum of the IMFs generated by the Empirical Mode Decomposition method applied to the time series. Let’s understand what this means and how we can apply that to our real-world problems.

Empirical Mode Decomposition

The first step is to divide our time series into what are called Intrinsic Mode Functions (IMFs). For that, we will use a method called Empirical Mode Decomposition (EMD), which receives a time series as input and generates the IMFs from it.

But what is an IMF?

An IMF is a time series in its own right that satisfies two conditions:

  • The number of extrema (maxima and minima) and the number of zero crossings must differ by at most one
  • The mean of the envelopes defined by the local maxima and the local minima must be zero at every point (so the IMF has a locally zero mean)
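As a rough, illustrative sketch (not part of the paper’s tooling), here is how one might check these two conditions numerically; the envelope condition is approximated by the signal’s overall mean, and the tolerance is an arbitrary assumption:

import numpy as np

def is_imf_candidate(x, mean_tol=1e-2):
    # Condition 1: extrema and zero crossings differ by at most one
    n_max = np.sum((x[1:-1] > x[:-2]) & (x[1:-1] > x[2:]))   # interior maxima
    n_min = np.sum((x[1:-1] < x[:-2]) & (x[1:-1] < x[2:]))   # interior minima
    n_zero = np.sum(np.diff(np.sign(x)) != 0)                # zero crossings
    cond1 = abs(int(n_max + n_min) - int(n_zero)) <= 1
    # Condition 2 (approximated): the mean is close to zero
    cond2 = abs(np.mean(x)) < mean_tol
    return bool(cond1 and cond2)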

The EMD will generate these IMFs by applying the following algorithm:

  • Find all of the maxima and minima of the series
  • Interpolate the maxima and the minima separately using a cubic spline, generating what we call the upper and lower envelopes. For those from a Computer Science background, you can view these as Big O and Big Omega functions
  • Calculate the average of the two envelopes and subtract it from the original signal

We can see what I mean by the above process on this plot:

Envelope creation on the EMD method. Image generated by the author

This new signal, obtained by subtracting the envelope average from the original time series, is tested to verify whether it is an IMF. If one or both of the conditions are not met, the algorithm repeats itself, this time on the new candidate instead of the original time series, and keeps going until an IMF is found.

Once an IMF is found, it is subtracted from the signal it was extracted from, generating a new signal we will call h. We then repeat this entire process on h until the next IMF is found, subtract that IMF from h, and so on.

This process only stops when we reach a series that is monotonic (i.e., it only increases or only decreases) or, in other words, a series with at most one extreme point (maximum or minimum). This series is called the residue.
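To make the sifting step concrete, here is a minimal sketch of a single iteration, assuming the signal has enough interior extrema for the spline fits (real implementations such as PyEMD also handle the boundaries, which are omitted here):

import numpy as np
from scipy.signal import argrelextrema
from scipy.interpolate import CubicSpline

def sift_once(x):
    t = np.arange(len(x))
    # Locate the interior maxima and minima
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    # Cubic splines through them form the upper and lower envelopes
    upper = CubicSpline(maxima, x[maxima])(t)
    lower = CubicSpline(minima, x[minima])(t)
    # Subtract the mean envelope to obtain the next IMF candidate
    return x - (upper + lower) / 2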

Interpreting the IMFs and Coding it

Because of how this process is constructed, one can easily see that if we sum all of the IMFs and the residue, we recover the original series.

This is why we call this method a ‘decomposition’: we are decomposing our series into a set of components. More than that, the first IMFs hold the noisiest signals, while the later ones present a more deterministic behavior, as we will see below.

Now, let’s code that up. For that, we need to install a library that already implements the EMD, called PyEMD. We can easily do that with pip:

pip install EMD-signal

We will also use some free datasets from the sktime library and a function from the sklearn library, so we must install those too:

pip install scikit-learn
pip install sktime

Now, let’s create a function that will return the IMFs given our signal:

import numpy as np
from scipy.fft import fft
import matplotlib.pyplot as plt
from PyEMD import EMD
from sktime.datasets import load_airline, load_shampoo_sales
from sklearn.feature_selection import mutual_info_regression

def emd(signal):
    # Decompose the signal into IMFs using Akima spline envelopes
    emd = EMD(DTYPE=np.float16, spline_kind='akima')
    imfs = emd(signal.values)

    # Plot the original signal followed by each IMF
    t = [i for i in range(len(signal))]
    N = imfs.shape[0]
    fig, axs = plt.subplots(N + 1, 1, figsize=(15, 9))
    axs[0].plot(t, signal)
    axs[0].set_title('Original Signal')
    for n, imf in enumerate(imfs):
        axs[n + 1].plot(t, imf)
        axs[n + 1].set_title(f'IMF {n}')

    return imfs

In this function, we are just calling the EMD from the PyEMD library and plotting the IMFs alongside the original data. Let’s apply it to the Airline dataset [2] from sktime, which is free for us to use, and see what happens.
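For reference, the figure below was produced with a call like the following; since the IMFs plus the residue must reconstruct the series, we can also verify that property (the tolerance is loose because of the float16 DTYPE above):

signal = load_airline()
imfs = emd(signal)

# Sanity check: the IMFs (including the residue) sum back to the signal
print(np.allclose(imfs.sum(axis=0), signal.values, atol=1.0))  # expected: True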

EMD decomposition for the Airline dataset [2]. Image generated by the author.

You can see that IMF 3 in fact behaves like a residue.

Now that we have our IMFs, what can we do with them? How are they useful for us to separate the stochastic from the deterministic component of our time series?

Phase Spectrum

After we generate the IMFs, we can analyze how each of these components behaves in the frequency spectrum. A random signal will have a frequency spectrum spread all over the place, with no clear dominant frequencies. A deterministic signal, on the other hand, will behave in a much more structured way.

One way of visualizing this is to look at the phase spectrum of these components in frequency space. For that, we must first get the frequency representation of the IMFs with the Fast Fourier Transform (FFT) and then divide the imaginary component by the real component.

Then we look at the angle of that ratio, which makes it easier to see what is happening. To get it, we apply the arctangent function.
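In symbols, for each frequency bin of the Fourier transform X(ω) of an IMF, we compute the phase angle

\varphi(\omega) = \arctan\left(\frac{\operatorname{Im}\, X(\omega)}{\operatorname{Re}\, X(\omega)}\right)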

Let’s see how this looks in code:

def phase_spectrum(imfs):
    # Compute and plot the phase angle of each IMF's Fourier transform
    imfs_p = []
    fig, axs = plt.subplots(len(imfs), 1, figsize=(15, 9))
    for i, imf in enumerate(imfs):
        trans = fft(imf)
        # Phase angle: arctangent of the imaginary over the real component
        imf_p = np.arctan(trans.imag / trans.real)

        imfs_p.append(imf_p)

        axs[i].plot(imf_p, 'o')
        axs[i].set_title(f'IMF {i}')

    return imfs_p

We are just applying the FFT from the SciPy library and then doing the calculations described above. Finally, we plot the phases, generating the following image:

Phases from the IMFs. Image generated by the author

One can see that, as we go down the IMFs towards the residue, a pattern starts to emerge and the angles start to align. This is an indication of deterministic behavior.

Looking at the plot, the last two components seem more deterministic than the first two, so we could select them this way and manually define the stochastic and deterministic components. However, there is a better way of doing that.

Selecting the IMFs with Mutual Information

One way of assessing the relationship between two distributions is mutual information. This metric measures how much information we gain about one variable by observing the other.

Two random signals should have mutual information close to zero. If only one of them is deterministic, this value should also be fairly close to zero. If both signals are deterministic, this value should be noticeably greater.
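As a quick, hypothetical sanity check of this intuition (the signals and seed below are made up purely for illustration):

import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(42)
t = np.linspace(0, 10, 500)

# Two independent noise signals: estimated MI should be near zero
noise_a = rng.normal(size=t.size)
noise_b = rng.normal(size=t.size)
print(mutual_info_regression(noise_a.reshape(-1, 1), noise_b)[0])

# Two related deterministic signals: estimated MI should be clearly larger
det_a = np.sin(t)
det_b = np.sin(t + 0.1)
print(mutual_info_regression(det_a.reshape(-1, 1), det_b)[0])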

Knowing this, we can compute the mutual information between the phases of consecutive IMFs and see where it spikes. That is the point where we should start considering the IMFs as part of the deterministic component.

Let’s code it up:

def phase_mi(phases):
    # Mutual information between the phases of consecutive IMFs
    mis = []
    for i in range(len(phases) - 1):
        mis.append(mutual_info_regression(phases[i].reshape(-1, 1), phases[i + 1])[0])

    return np.array(mis)

Here we leverage the mutual information implementation from the sklearn library.

Now, there are many ways of using mutual information. In this post, we will look for a cutoff point.

Dividing the Signal

Now we have everything we need to split our time series into two components. Let’s look at the code:

def divide_signal(signal, imfs, mis, cutoff=0.05):
    # The first IMF whose phase MI exceeds the cutoff marks the start
    # of the deterministic component
    cut_point = np.where(mis > cutoff)[0][0]
    stochastic_component = np.sum(imfs[:cut_point], axis=0)
    deterministic_component = np.sum(imfs[cut_point:], axis=0)

    t = [i for i in range(len(signal))]

    fig, axs = plt.subplots(3, 1, figsize=(15, 12))
    axs[0].plot(t, signal.values)
    axs[0].set_title('Original Signal')

    axs[1].plot(t, stochastic_component)
    axs[1].set_title('Stochastic Component')

    axs[2].plot(t, deterministic_component)
    axs[2].set_title('Deterministic Component')

    return stochastic_component, deterministic_component

Here we define how much mutual information we consider as the start of the deterministic component, then we sum the IMFs on each side of that point and plot the two components alongside our original signal.
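Putting everything together, the whole pipeline for the airline series looks like this (the 0.05 cutoff is just the function’s default and may need tuning for other series):

# Full pipeline: decompose, compute phases, measure MI, split
signal = load_airline()
imfs = emd(signal)
phases = phase_spectrum(imfs)
mis = phase_mi(phases)
stochastic, deterministic = divide_signal(signal, imfs, mis)
plt.show()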

Now, with these two components at hand, we can forecast the deterministic component using Takens’ Embedding Theorem, and the stochastic component using a forecaster such as ARIMA to capture any deterministic residual that may still be present in it.
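As a minimal sketch of that last idea, here is what fitting an ARIMA model to the stochastic component could look like; note that statsmodels and the (1, 0, 1) order are assumptions of mine, not something prescribed by the method:

# Hypothetical example: model the stochastic component with ARIMA
from statsmodels.tsa.arima.model import ARIMA

model = ARIMA(stochastic, order=(1, 0, 1))  # order chosen arbitrarily
fitted = model.fit()
forecast = fitted.forecast(steps=12)  # e.g., twelve steps ahead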

Conclusion

Now we have another tool in our toolkit that we can use to improve our time-series analysis. Using this method together with Takens’ Embedding Theorem, we have a good framework to start analyzing and forecasting time-series data.

Of course, no method is perfect, and there will be cases where this framework does not yield the best possible results. But that is the beauty of Data Science: there is no correct answer a priori, so we must dive into the problem and find out the best way of moving forward.

Acknowledgments

I must thank Rodrigo Mello, who first introduced me to these concepts in 2019 during a class at our university.

[1] Rios, R. A. and de Mello, R. F. (2016). Applying Empirical Mode Decomposition and mutual information to separate stochastic and deterministic influences embedded in signals. Signal Processing, 118, 159–176. doi:10.1016/j.sigpro.2015.07.003

[2] Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1976) Time Series Analysis, Forecasting and Control. Third Edition. Holden-Day. Series G.
