
A time series is a sequence of values ordered in time. We encounter time series data in almost every domain: weather forecasts, exchange rates, sales data, and sound waves are just a few examples. Any data that can be represented as a sequence ordered in time can be treated as a time series.
In an earlier post, I covered the basic concepts in time series analysis. In this post, we will create time series data with different patterns. One advantage of synthetic datasets is that we know exactly which patterns they contain, so we can measure how well a model captures them and get an idea of how it may perform on real-life data.
The common patterns observed in a time series are:
- Trend: An overall upward or downward direction.
- Seasonality: Patterns that repeat at regular or predictable intervals.
- White noise: A time series does not always follow a trend or include seasonality. Some processes produce purely random data. This kind of time series is called white noise.
Note: The patterns are not always smooth and usually include some kind of noise. Furthermore, a time series may include a combination of different patterns.
We will use numpy to generate arrays of values and matplotlib to plot the series. Let’s start by importing the required libraries:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
We can define a function that takes the time and value arrays as input and creates a plot:
def plot_time_series(time, values, label):
    plt.figure(figsize=(10,6))
    plt.plot(time, values)
    plt.xlabel("Time", fontsize=20)
    plt.ylabel("Value", fontsize=20)
    plt.title(label, fontsize=20)
    plt.grid(True)
Trend in Time Series
The first plot is the simplest one: a time series with an upward trend. We create an array for time and an array of values that grows with a constant slope, then pass these arrays as arguments to our function:
time = np.arange(100)
values = time*0.4
plot_time_series(time, values, "Upward Trend")
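Since we built this series ourselves, we know the slope is 0.4, which makes it easy to check whether a fitting method recovers it. The snippet below is a minimal sketch of that idea using np.polyfit with a degree-1 polynomial. The downward series and its starting value of 40 are assumptions added for illustration, not part of the original example.

```python
import numpy as np

# Rebuild the upward trend from the example above
time = np.arange(100)
values = time * 0.4

# Fit a straight line; for degree 1, polyfit returns [slope, intercept]
slope, intercept = np.polyfit(time, values, 1)

# A downward trend only needs a negative slope
# (the starting value 40 is an arbitrary choice for this sketch)
downward = 40 - 0.4 * time
down_slope, down_intercept = np.polyfit(time, downward, 1)

print("upward slope:", round(slope, 3))
print("downward slope:", round(down_slope, 3))
```

The fitted slopes should match the values we used to build the series, which is exactly the kind of sanity check synthetic data makes possible.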

Seasonality in Time Series
We can now plot a time series with seasonality. We need a series that repeats the same pattern.
# Just a random pattern
time = np.arange(50)
values = np.where(time < 10, time**3, (time-9)**2)
# Repeat the pattern 5 times; cast to float so that noise can be added later
seasonal = np.tile(values, 5).astype(float)
# Plot
time_seasonal = np.arange(250)
plot_time_series(time_seasonal, seasonal, label="Seasonality")

This is just a random pattern. Feel free to try out different patterns with numpy.
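As one example of a different pattern, here is a rough sketch that uses a sine wave as the repeating cycle. The period of 50 and the amplitude of 100 are arbitrary assumptions chosen to stay close to the example above. The last line checks that the series really is periodic by comparing each value with the value one period earlier.

```python
import numpy as np

period = 50      # length of one cycle (assumed, same as the example above)
n_cycles = 5     # how many times the cycle repeats

# One full cycle of a sine wave, scaled to an arbitrary amplitude of 100
one_cycle = 100 * np.sin(2 * np.pi * np.arange(period) / period)

# np.tile places copies of the cycle back to back
sine_seasonal = np.tile(one_cycle, n_cycles)

# A seasonal series equals itself shifted by one period
is_periodic = np.allclose(sine_seasonal[period:], sine_seasonal[:-period])
print(len(sine_seasonal), is_periodic)
```

The resulting array can be passed to plot_time_series together with np.arange(period * n_cycles).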
Noise
Let’s add some noise to the values because we are more likely to work with noisy data rather than smooth curves in real life.
We can create the random noise using the np.random.randn function, which draws samples from a standard normal distribution, and then add that noise to the original seasonal series:
noise = np.random.randn(250)*100
# Element-wise addition; the result is a numpy array
seasonal = seasonal + noise
time_seasonal = np.arange(250)
plot_time_series(time_seasonal, seasonal, label="Seasonality with Noise")
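Because the noise is random, the plot changes on every run. If we want the same noisy series each time, for example to compare two models fairly, one option is a seeded generator. The sketch below assumes numpy's np.random.default_rng is available (numpy 1.17 or later). The helper name make_noise and the seed value are hypothetical and only used for illustration.

```python
import numpy as np

def make_noise(n, scale, seed):
    """Gaussian noise with standard deviation `scale`, reproducible via `seed`.

    Hypothetical helper for this sketch, not part of the original post.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=scale, size=n)

# Rebuild the seasonal series from the example above
time = np.arange(50)
pattern = np.where(time < 10, time**3, (time - 9)**2)
seasonal = np.tile(pattern, 5).astype(float)

# Same seed, same noise: the two noisy series are identical
noisy_a = seasonal + make_noise(250, scale=100, seed=42)
noisy_b = seasonal + make_noise(250, scale=100, seed=42)
print(np.array_equal(noisy_a, noisy_b))
```

Changing the seed gives a different but equally reproducible noisy series.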

Multiple Patterns
We may see a combination of different patterns in a time series. For example, the following time series contains both an upward trend and seasonality. Of course, there is also some noise.
seasonal_upward = seasonal + np.arange(250)*10
time_seasonal = np.arange(250)
plot_time_series(time_seasonal, seasonal_upward, label="Seasonality + Upward Trend + Noise")
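If we plan to generate many such series, it can help to wrap the components in a single function. The following is a minimal sketch under that assumption. The function name make_series and its parameters are hypothetical, chosen for this illustration only.

```python
import numpy as np

def make_series(n, slope=0.0, pattern=None, noise_scale=0.0, seed=None):
    """Build a synthetic series as trend + repeated pattern + Gaussian noise.

    Hypothetical helper: every argument name here is an assumption of this sketch.
    """
    time = np.arange(n)
    series = slope * time.astype(float)            # linear trend
    if pattern is not None:
        pattern = np.asarray(pattern, dtype=float)
        repeats = -(-n // len(pattern))            # ceiling division
        series += np.tile(pattern, repeats)[:n]    # seasonality, trimmed to n points
    if noise_scale > 0:
        rng = np.random.default_rng(seed)
        series += rng.normal(0.0, noise_scale, size=n)
    return time, series

# Same ingredients as the example above: slope 10, 50-point pattern, noise scale 100
base = np.arange(50)
pattern = np.where(base < 10, base**3, (base - 9)**2)
time_seasonal, seasonal_upward = make_series(250, slope=10, pattern=pattern,
                                             noise_scale=100, seed=0)
print(time_seasonal.shape, seasonal_upward.shape)
```

Switching a component off, for example by setting noise_scale=0, makes it easy to see how much each part contributes to the final shape.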

White Noise
Some processes just produce random data that does not follow any pattern. This kind of time series is known as white noise. Because its values are uncorrelated with each other, there is no pattern for a model to learn, which makes it very hard to predict. Let’s create an example of white noise:
time = np.arange(200)
values = np.random.randn(200)*100
plot_time_series(time, values, label="White Noise")
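One informal way to see the difference between white noise and a patterned series is the lag-1 autocorrelation, which measures how strongly each value is related to the previous one. The sketch below computes it by hand with numpy. The helper lag_autocorr is hypothetical, and the series length of 20,000 is an assumption chosen so that the estimate is stable; with only 200 points it would fluctuate more.

```python
import numpy as np

def lag_autocorr(x, lag=1):
    """Sample autocorrelation at a positive lag (hypothetical helper for this sketch)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

rng = np.random.default_rng(7)   # fixed seed, an arbitrary choice
n = 20_000

white = rng.normal(0.0, 100.0, size=n)   # white noise with scale 100
trend = 0.4 * np.arange(n)               # a pure upward trend for comparison

white_ac = lag_autocorr(white)
trend_ac = lag_autocorr(trend)
print("white noise:", round(white_ac, 3))
print("trend:", round(trend_ac, 3))
```

For white noise the value should be close to zero, while for the trend it should be close to one, since each point is almost fully determined by the one before it.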

Non-stationary Time Series
Up to now, we have seen time series whose pattern continues unchanged from start to end. In the stricter statistical sense, a time series is stationary when properties such as its mean and variance do not change over time, so a series with a trend is already non-stationary. Life is also full of surprises, and an event can break an existing pattern and change the behavior of the series altogether. For example, the coronavirus pandemic was such a big event that it disrupted many patterns and required businesses to update their time series analyses. Let’s create an example:
big_event = np.zeros(250)
big_event[-50:] = np.arange(50)*-50
non_stationary = seasonal_upward + big_event
time_seasonal = np.arange(250)
plot_time_series(time_seasonal, non_stationary, label="Non-stationary Time Series")

We introduced a big event that starts at time point 200 (the last 50 points of the series), and its effect can be seen from there onwards.
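A simple way to confirm that the behavior of the series changes is to fit a straight line separately to the points before and after the event and compare the slopes. The snippet below is a rough sketch of that check. To keep the slopes easy to read, it assumes a series made of only the trend, the event, and noise, leaving out the seasonal pattern. The seed and the variable names are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, an arbitrary choice
n, event_start = 250, 200
time = np.arange(n)

# Upward trend with slope 10, as in the combined example
trend = 10.0 * time

# The event pulls the series down by 50 per step over the last 50 points
big_event = np.zeros(n)
big_event[event_start:] = np.arange(n - event_start) * -50

series = trend + big_event + rng.normal(0.0, 100.0, size=n)

# Fit a line to each segment; index 0 of the polyfit result is the slope
slope_before = np.polyfit(time[:event_start], series[:event_start], 1)[0]
slope_after = np.polyfit(time[event_start:], series[event_start:], 1)[0]
print("slope before:", round(slope_before, 1))
print("slope after:", round(slope_after, 1))
```

The slope before the event should be close to 10, and the slope after it close to 10 - 50 = -40, which shows that a model fitted to the first part of the series would no longer describe the last part.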
Time series analysis is a broad field within data science. A comprehensive understanding of time series analysis requires knowledge of machine learning, statistics, and, of course, domain expertise. In this post, we covered how to create synthetic datasets. We can use these datasets to check the performance of the models we build. In an earlier post, I explained the basic concepts in time series analysis to understand the characteristics of time series and its applications. I plan to continue writing about time series analysis, from simple concepts to advanced analysis techniques. Stay tuned for the following posts.
Thank you for reading. Please let me know if you have any feedback.