
Time-series forecasting is an important area of Machine Learning, particularly for prediction problems that involve a time component.
These days I am working on a project based on time series analysis and forecasting. While researching the area, I thought it would be useful, both for me and for people getting started with time series forecasting, to document my findings.
The purpose of this article is to introduce time series along with the basic concepts and modelling techniques related to Time Series Analysis and forecasting.
What is a Time series?
A time series can be defined as a sequence of a metric recorded over regular time intervals. Depending on the frequency, a time series can be yearly, quarterly, monthly, etc.
Two things make time series different from a regular regression problem. The first is time dependence: in linear regression models, observations are independent, but here observations depend on time. The second is seasonality: there can be seasonal trends, i.e. variations specific to a particular time frame.
There are 2 methods used for Time Series Forecasting.
- Univariate Time-series Forecasting: only two variables, one being time and the other the field to forecast.
- Multivariate Time-series Forecasting: multiple variables, with one variable being time and the others serving as input parameters.
There are some special characteristics needed to be considered when it comes to time series analysis. They are,
1. Trend
The trend shows the general tendency of the data to increase or decrease with time.
2. Seasonality
Seasonality in a time series is a regular pattern of changes that repeats over S time periods, where S defines the number of periods until the pattern repeats.
For example, consider ice cream sales over several years: sales are high in summer and low in winter, and this pattern repeats every year. With monthly data, S would be 12.

3. Stationarity
Stationarity means that the statistical properties of a time series, namely its mean, variance and covariance, do not change over time.

- In the first plot, the mean varies (increases) with time, which results in an upward trend.
- In the second plot, there is no trend in the series, but the variance varies over time.
- In the third plot, the spread becomes narrower as time increases, which means that the covariance varies over time.

In this diagram, all three properties are constant over time, which is what a stationary time series looks like.
We can use several methods to identify whether the time series is stationary or not.
- Visual test, which identifies stationarity simply by looking at a plot of the series.
- ADF (Augmented Dickey-Fuller) Test, which is used to determine the presence of a unit root in the series.
- KPSS (Kwiatkowski-Phillips-Schmidt-Shin) Test, which tests the null hypothesis that the series is stationary.
Making a Time Series stationary by differencing
In order to use time series forecasting models, we first need to convert any non-stationary series into a stationary one. One method is differencing, which is performed to remove the varying mean.
In this method, the previous value is subtracted from the current value. The number of times differencing is needed to remove the seasonality depends on the complexity of the time series.
Seasonal differencing is the difference between a value and a value with lag that is a multiple of S.
The correct order of differencing is the minimum differencing required to get a near-stationary series with a roughly constant mean, and an ACF plot that reaches zero as quickly as possible.
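A minimal sketch of first and seasonal differencing with pandas, on a hypothetical monthly series:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
idx = pd.date_range("2018-01", periods=48, freq="MS")
# Trend + 12-month seasonality + noise (made-up monthly series).
y = pd.Series(np.arange(48) * 0.5
              + 10 * np.sin(2 * np.pi * np.arange(48) / 12)
              + rng.normal(0, 0.5, 48), index=idx)

first_diff = y.diff()       # y_t - y_{t-1}: removes the varying mean
seasonal_diff = y.diff(12)  # y_t - y_{t-12}: removes the yearly pattern
print(first_diff.dropna().head())
```

Each differencing operation discards the observations it cannot compute: one value for first differencing, twelve for seasonal differencing with lag 12.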
4. White noise

Simply put, "white" means all frequencies are equally represented, and "noise" is because there is no pattern, just random variation.
A time series is white noise if it is distributed with a mean of zero, a constant variance, and zero correlation between lags. Gaussian white noise, binary white noise and sinusoidal white noise are examples of white noise.
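A quick sketch of Gaussian white noise with NumPy, checking the three properties above:

```python
import numpy as np

rng = np.random.default_rng(3)
noise = rng.normal(loc=0.0, scale=1.0, size=5000)  # Gaussian white noise

mean = noise.mean()
var = noise.var()
# Lag-1 autocorrelation: correlation of the series with itself shifted by one step.
lag1 = np.corrcoef(noise[:-1], noise[1:])[0, 1]

print(f"mean={mean:.3f} variance={var:.3f} lag-1 autocorr={lag1:.3f}")
```

With enough samples, the mean is close to zero, the variance close to one, and the lag correlation close to zero, exactly the white-noise properties.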
Next, let's look at what the ACF and PACF plots are.
Auto Correlation Function

Correlation summarizes the strength of the relationship between two variables, and we can use Pearson's correlation coefficient for this purpose.
Pearson's correlation coefficient is a number between -1 and 1 describing a negative or positive correlation respectively.
We can calculate the correlation of time-series observations with previous time steps, called lags. Since the correlation is calculated with values of the same series at previous times, this is called serial correlation, or autocorrelation.
A plot of the autocorrelation of a time series by lag is called the Auto Correlation Function (ACF); this plot is also called a correlogram or autocorrelation plot.
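As an illustrative sketch, `acf` from statsmodels computes these values for a simulated autocorrelated series (the 0.8 coefficient is made up for the example):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(4)
# An AR(1)-like series: each value carries over 80% of the previous one.
y = np.zeros(1000)
for t in range(1, 1000):
    y[t] = 0.8 * y[t - 1] + rng.normal()

autocorr = acf(y, nlags=5)
print(np.round(autocorr, 2))  # lag 0 is always 1; later lags decay gradually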
Partial Autocorrelation Function

PACF describes the direct relationship between an observation and its lag. It summarizes the relationship between an observation in a time series and observations at prior time steps, with the relationships of intervening observations removed.
The autocorrelation between an observation and an observation at a prior time step consists of both direct and indirect correlations. The indirect correlations are a linear function of the correlation of the observation with observations at intervening time steps. The partial autocorrelation function removes these indirect correlations.
Now, let's discuss some models used in time series forecasting.
AR model
An Auto-Regressive (AR-only) model is one where the model depends only on its own lags.

MA model
A Moving Average (MA) model is one where the model depends only on the lagged forecast errors, which are the errors at the respective lags.

ARMA model
The Autoregressive Moving Average (ARMA) process is the basic model for analysing a stationary time series. The ARMA model merges the AR and MA models.
The AR part explains the momentum and mean-reversion effects, and the MA part captures the shock effects observed in the white noise terms. These shock effects can be thought of as unexpected events affecting the observation process, such as surprise earnings, wars, attacks, etc.
ARIMA model
Auto-Regressive Integrated Moving Average, aka ARIMA, is a class of models based on a series' own lags and lagged forecast errors. Any non-seasonal time series that exhibits patterns and is not random white noise can be modelled with ARIMA models.
An ARIMA model is characterized by 3 terms:
- p is the order of the AR term, i.e. the number of lags of Y to be used as predictors.
- q is the order of the MA term, i.e. the number of lagged forecast errors that should go into the model.
- d is the minimum number of differencing operations needed to make the series stationary.

SARIMA model
In a seasonal ARIMA model, seasonal AR and MA terms predict using data values and errors at lags that are multiples of m (the span of the seasonality).

Non-seasonal terms (p, d, q): we can use the ACF and PACF plots for these. Examining the spikes at early lags, the ACF indicates the MA term (q); similarly, the PACF indicates the AR term (p).
Seasonal terms (P, D, Q and m): for these, we need to examine the patterns across lags that are multiples of m. In most cases, the first two or three seasonal multiples are enough. Use the ACF and PACF in the same way.
Apart from those models discussed above, there are some more models like Vector Autoregression (VAR), ARCH/GARCH Model, LSTMs etc.
Conclusion
The most important use of time series analysis is that it helps us forecast the future behaviour of a variable based on its past. I hope you got a basic understanding of what a time series is, the basic concepts associated with time series analysis, and the intuition behind the AR, MA, ARIMA and SARIMA models.
Thank you for reading!